(57) Abstract

This invention discloses a method for generating a recombinant
library by introducing one or more changes within a
predetermined region of double-stranded nucleic acid, comprising
providing a first primer population and a second primer
population, each of the populations having a variable base
composition at known positions along the primers, the primers
incorporating a class IIS restriction enzyme recognition
sequence, being capable of directing change in the nucleic acid
sequence and being substantially complementary to the
double-stranded nucleic acid to permit hybridization thereto.
The method additionally comprises hybridizing the first and
second primer populations to opposite strands of the
double-stranded nucleic acid to form a first pair of
primer-templates oriented in opposite directions, performing
enzymatic inverse polymerase chain reaction to generate at least
one linear copy of the double-stranded nucleic acid
incorporating the change directed by the primers, cutting the
double-stranded nucleic acid copy with a class IIS restriction
enzyme to form a restricted linear nucleic acid molecule
containing the change, joining termini of the restricted linear
nucleic acid molecule to produce double-stranded circular
nucleic acid and introducing the nucleic acid into compatible
host cells. A method is additionally provided for generating a
recombinant library using wobble-base mutagenesis.
ENZYMATIC INVERSE POLYMERASE CHAIN REACTION LIBRARY MUTAGENESIS



BACKGROUND OF THE INVENTION 

Recombinant DNA techniques have revolutionized molecular
biology and genetics by permitting the isolation and
characterization of specific DNA fragments. Of major impact
has been the exponential amplification of small amounts of DNA
by a technique known as the polymerase chain reaction (PCR).
The sensitivity, speed and versatility of PCR make this
technique amenable to a wide variety of applications such as
medical diagnostics, human genetics, forensic science and
other disciplines of the biological sciences.

PCR is based on the enzymatic amplification of a DNA
sequence that is flanked by two oligonucleotide primers which
hybridize to opposite strands of the target sequence. The
primers are oriented in opposite directions with their 3' ends
pointing towards each other. Repeated cycles of heat
denaturation of the template, annealing of the primers to
their complementary sequences and extension of the annealed
primers with a DNA polymerase result in the amplification of
the segment defined by the 5' ends of the PCR primers. Since
the extension product of each primer can serve as a template
for the other primer, each cycle results in the exponential
accumulation of the specific target fragment, up to several
million fold in a few hours. The method can be used with a
complex template such as genomic DNA and can amplify a
single-copy gene contained therein. It is also capable of
amplifying a single molecule of target DNA in a complex mixture
of RNAs or DNAs and can, under some conditions, produce
fragments up to ten kb long. The PCR technology is the subject
matter of United States Patent Nos. 4,683,195, 4,800,159,
4,754,065, and 4,683,202, all of which are incorporated herein
by reference.

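As a rough illustration of the exponential accumulation described
above, the following sketch computes an idealized fold
amplification for a given number of cycles. The function names and
the per-cycle efficiency parameter are assumptions introduced here
for illustration only; real reactions plateau and seldom double
perfectly in every cycle.

```python
import math


def fold_amplification(cycles: int, efficiency: float = 1.0) -> float:
    """Idealized fold amplification after `cycles` PCR cycles.

    efficiency is the fraction of templates copied per cycle
    (1.0 = perfect doubling). This is a simplification, not a
    model of a real reaction.
    """
    if cycles < 0 or not 0.0 <= efficiency <= 1.0:
        raise ValueError("cycles must be >= 0 and 0 <= efficiency <= 1")
    return (1.0 + efficiency) ** cycles


def cycles_for_fold(fold: float, efficiency: float = 1.0) -> int:
    """Smallest whole number of cycles giving at least `fold` amplification."""
    if fold < 1.0 or not 0.0 < efficiency <= 1.0:
        raise ValueError("fold must be >= 1 and 0 < efficiency <= 1")
    return math.ceil(math.log(fold) / math.log(1.0 + efficiency))
```

Under perfect doubling, a million-fold amplification corresponds to
about twenty cycles, which is consistent with the "several million
fold in a few hours" figure given above.
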
In addition to the use of PCR for amplifying target
sequences, this method has also been used to generate
site-specific mutations in known sequences. Mutations are
created by introducing mismatches into the oligonucleotide
primers used in the PCR amplification. The oligonucleotides,
with their mutant sequences, are then incorporated at both ends
of the linear PCR product. In addition to their mutated
sequences, the primers often contain restriction enzyme
recognition sequences which are used for sub-cloning the
mutated linear DNAs into vectors in place of the wild-type
sequences. Although this procedure is relatively simple to
perform, its applications are limited because appropriate
restriction sequences are not always conveniently located for
substituting the mutant sequence for the wild-type sequence.
Restriction sequences can be incorporated into the wild-type
sequences for subcloning. However, such extraneous sequences
can cause detrimental effects to the function of the gene or
resulting gene product. Moreover, PCR products typically
contain heterogeneous termini resulting from the addition of
extra nucleotides and/or incomplete extension of the
primer-templates. Such termini are extremely difficult to
ligate and therefore result in a low subcloning efficiency.

Several modifications of the PCR-based site-directed
mutagenesis strategies have been developed to circumvent such
limitations, but they too have undesirable features. The most
prominent undesirable feature exhibited by these alternative
methods is a low frequency of correct mutations. For example,
inverse PCR (IPCR) is a method which amplifies a circular
plasmid rather than a linear molecule, Hemsley et al., Nuc.
Acid. Res. 17:6545-6551 (1989), which is incorporated herein
by reference. In this technique, two primers which are
located back to back on opposing DNA strands of a plasmid
drive the PCR reaction. The resultant PCR product, a linear
DNA molecule identical in length to the starting plasmid,
contains any mutations which were designed into the primers.
The product is then enzymatically prepared for ligation by
blunting and phosphorylating the termini. Enzymatic treatment
of the termini is a necessary step for ligation due to
heterogeneous termini associated with PCR products. These
treatments are likely to be incomplete and cause unwanted
mutations as well as result in a low ligation and
transformation efficiency due to the additional required
steps.

Recombinant circle PCR (RCPCR), Jones and Howard,
BioTechniques 8:178-183 (1990), and recombination PCR (RPCR),
Jones and Howard, BioTechniques 10:62-65 (1991), on the other
hand, are two methods similar to IPCR which do not require any
enzymatic treatment. In RCPCR, two separate PCR reactions,
requiring a total of four primers, are needed to generate the
mutated product. The separate amplification reactions are
primed at different locations on the same template to generate
products that, when combined, denatured and cross-annealed,
form double-stranded DNA with complementary single-strand
ends. The complementary ends anneal to form DNA circles
suitable for transformation into E. coli.

RPCR is a technique that uses PCR primers having a twelve
base exact match at their 5' ends, resulting in a PCR product
with homologous double-stranded termini. Transformation of
the linear product into recombination-positive (recA-positive)
cells produces a circular plasmid through in vivo
recombination. Although this method reduces the number of
steps and primers used compared to RCPCR, the transformation
and recombination of linear molecules is an inefficient
process resulting in a correspondingly low mutation frequency.

A modification of site-directed mutagenesis, random
mutagenesis, permits the incorporation of random mutations
into a polynucleotide. Mutant libraries are normally
constructed by the mutagenesis of a small, defined area of a
plasmid containing the gene or control region of interest.
Methods for generating mutant libraries typically use
synthetic oligonucleotides with random or biased mixtures of
bases in one or more positions along the oligonucleotide. A
variety of methods have been used to introduce these mutagenic
oligonucleotides into the expression vector. Typically, the
oligonucleotides are hybridized to a substantially
complementary strand of DNA and a polymerase is used to extend
the length of the oligonucleotide into a polynucleotide whose
length is dependent both on the length of the template and on
the conditions of enzymatic extension. This procedure permits
the construction of large libraries of mutants having
mutations in one or more regions of the polynucleotide or
protein sequence as compared with the template. From these
libraries, the transfectants or transformants can be screened
for the desired characteristic. However, both random
mutagenesis employing PCR and random mutagenesis in general
are restricted in design by the choice of restriction
endonucleases traditionally employed for these procedures.
Often random mutagenesis has a relatively low efficiency such
that a significant number of individual mutations are lost
during primer extension and introduction of the polynucleotide
into the host. Further, mistakes or unintended mutations are
often incorporated into the sequences, resulting in an
additional decrease in the efficiency. Selected mutations may
therefore be under- or over-represented in the library.

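To make the notion of library diversity concrete, the sketch below
counts the distinct sequences encoded by an oligonucleotide written
with standard IUPAC degenerate base codes, and estimates what
fraction of those sequences would be expected among a given number
of transformants. The function names are hypothetical, and the
coverage estimate assumes every variant is equally represented and
sampled independently, which, as noted above, is often not the case
in practice.

```python
import math

# Standard IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def library_size(oligo: str) -> int:
    """Number of distinct sequences encoded by a degenerate oligonucleotide."""
    size = 1
    for base in oligo.upper():
        size *= len(IUPAC[base])
    return size


def expected_coverage(size: int, transformants: int) -> float:
    """Expected fraction of variants seen at least once.

    Assumes equal representation and independent sampling
    (an idealization of a real library).
    """
    if size <= 0 or transformants < 0:
        raise ValueError("size must be > 0 and transformants >= 0")
    return 1.0 - math.exp(-transformants / size)
```

For example, an oligonucleotide with two fully random codons of the
form NNK encodes 1024 sequences, so a library of the same number of
transformants would be expected to contain only about 63% of them
under these idealized assumptions.
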
Thus, a need exists for a PCR-based mutagenesis method
which allows the rapid and efficient alteration of nucleotide
sequences to create libraries that are sufficiently diverse.
The present invention satisfies this need and provides related
advantages as well.



BRIEF DESCRIPTION OF THE DRAWINGS



Figure 1 is a schematic diagram outlining the steps of 
EIPCR. 

Figure 2 shows the design of EIPCR primers. Line A shows
a region of the PCR template (SEQ ID NO: 1) and two mutations
to be made by EIPCR (indicated by small arrows). Line B shows
how the primers (SEQ ID NO: 2; SEQ ID NO: 3) relate to the
mutated product (line C) (SEQ ID NO: 4). This is not an
actual reaction intermediate, but is a cartoon to draw when
designing the primers. The primers are indicated in grey.
The Bsa I recognition sequence (SEQ ID NO: 5) is underlined.
Four or more bases are added 5' to the enzyme recognition
sequence of each primer to ensure efficient substrate
recognition by the enzyme. Line C shows the sequence of the
mutated product. The grey boxes show the parts of the primer
that have been incorporated into the final product. The
overhangs of the two DNA ends are indicated, but the
recognition sequences have been cut off and are not part of
the final product.

WO 93/12257    PCT/US92/10647

Figure 3 is a list of class IIS restriction enzymes and
the nucleotide sequence of their recognition sequences (SEQ ID
NOS: 5 through 20).

Figure 4 is a schematic diagram showing the use of EIPCR
technology for generating single chain antibodies. Line A
shows the template region (SEQ ID NO: 21) to be mutagenized to
create a linker between heavy and light chain encoding
sequences. Line B shows the EIPCR primer design (SEQ ID NO:
22; SEQ ID NO: 23) and line C shows the nucleotide (SEQ ID NO:
24) and amino acid (SEQ ID NO: 25) sequence of an identified,
active single chain antibody sequence.

Figure 5 is a schematic of the 1.8 kb expression vector
pMCHAFvl for CHA255 Fv fragment expression. The expression
cassette is located between Hind III and Eco RI restriction
endonuclease sequences in pUC19.

Figure 6 is a schematic of EIPCR primer design. Line A
shows the area of the wild-type leader sequence that was
replaced by a library of leader sequences. Line B shows the
design of the mutagenic primers relative to the template (SEQ
ID NO: 26 and SEQ ID NO: 27). Line C shows the sequence of
the identified, positive single chain Fv linker conferring
increased protein expression that was obtained from the random
library (SEQ ID NO: 28).

Figure 7 is a schematic illustrating EIPCR promoter
library mutagenesis. Figure 7A is the template sequence. The
underlined regions in Figure 7B indicate the regions of
variability in the library.

SUMMARY OF THE INVENTION

The invention is directed to a method for generating a
recombinant mutagenesis library by introducing one or more
changes within a predetermined region of double-stranded
nucleic acid, comprising providing a first primer population
and a second primer population, each population having a
variable base composition at known positions along the
primers, the primers incorporating a class IIS restriction
enzyme recognition sequence, being capable of directing change
in the nucleic acid sequence and being substantially
complementary to the double-stranded nucleic acid to allow
hybridization thereto. The method also comprises hybridizing
the first and second primer populations to opposite strands of
the double-stranded nucleic acid to form a first pair of
primer-templates oriented in opposite directions, performing
enzymatic inverse polymerase chain reaction to generate at
least one linear copy of the double-stranded nucleic acid
incorporating the change directed by the primers, cutting the
double-stranded nucleic acid copy with a class IIS restriction
enzyme to form a restricted linear nucleic acid molecule
containing the change and introducing nucleic acid generated
therefrom into compatible host cells.

In a preferred embodiment, the method additionally
comprises the step of joining termini of the restricted linear
nucleic acid molecule to produce double-stranded circular
nucleic acid. The method preferably produces restricted
linear nucleic acid molecules containing only the directed
change in the nucleic acid sequence. Preferably the
double-stranded nucleic acid is circular DNA. The method can be
performed on either eukaryotic or prokaryotic cells.

In a preferred embodiment of the invention, the
double-stranded nucleic acid encodes polypeptide. The change in
the nucleic acid can be introduced into the amino acid coding
region of the polypeptide or into a regulatory region of the
polypeptide. Thus changes may be introduced into promoter and
enhancer regions of the double-stranded nucleic acid. The
polypeptide encoded by the double-stranded nucleic acid is
preferably expressed from the host cells.

In another preferred embodiment of the invention, the
double-stranded nucleic acid comprises a viral vector and
compatible host cells comprise a helper virus packaging cell
line that directs the packaging of viral particles containing
the viral vector. The viral particles are preferably
collected and the method additionally comprises the step of
infecting susceptible cells with the viral particles.

In yet another preferred embodiment of the invention, a
method is provided for improving polypeptide expression from
a double-stranded nucleic acid sequence encoding polypeptide
comprising: measuring polypeptide expression from the
double-stranded nucleic acid in a compatible host cell,
providing a first primer population and a second primer
population, each of the populations having a variable base
composition at known positions along the primers, the primers
incorporating a class IIS restriction enzyme recognition
sequence, being capable of directing change in the nucleic
acid sequence and being substantially complementary to the
double-stranded nucleic acid to allow hybridization thereto.
The method additionally comprises hybridizing the first and
second primer populations to opposite strands of the
double-stranded nucleic acid to form a first pair of
primer-templates oriented in opposite directions, performing
enzymatic inverse polymerase chain reaction to generate at
least one linear copy of the double-stranded nucleic acid
incorporating the change directed by the primers, cutting the
double-stranded nucleic acid copy with a class IIS restriction
enzyme to form a restricted linear nucleic acid molecule
containing the change, introducing the nucleic acid from the
cutting step or the PCR step into host cells and measuring
polypeptide expression from the modified nucleic acid in the
cells, and identifying cells with expression levels greater
than the expression levels measured in cells containing
unmodified double-stranded nucleic acid. The method
preferably additionally comprises the step of joining termini
of the restricted linear nucleic acid molecule to produce
modified double-stranded circular nucleic acid, and the method
also preferably comprises the step of obtaining modified
template from selected cells. Preferably the modified nucleic
acid sequence is identified and transferred into another
nucleic acid sequence. The primers can direct changes in a
regulatory sequence, including promoters, or the primers can
direct changes in a polypeptide sequence. In a preferred
embodiment the primers direct changes in a ribosome binding
sequence.

In yet another preferred embodiment of this invention, a
method is provided for generating a recombinant library using
wobble-base mutagenesis comprising: providing a first primer
population and a second primer population, said primers being
substantially complementary to a region of double-stranded
nucleic acid encoding polypeptide to allow hybridization
thereto, the primers having a variable base composition in the
third position of at least one nucleotide codon corresponding
to the double-stranded nucleic acid and a class IIS
restriction enzyme recognition sequence. The method
additionally comprises hybridizing the first and second primer
populations to opposite strands of the double-stranded nucleic
acid to form a first pair of primer-templates oriented in
opposite directions, performing enzymatic inverse polymerase
chain reaction to generate at least one linear copy of the
double-stranded nucleic acid incorporating the change directed
by the primers, cutting the double-stranded linear nucleic
acid with a class IIS restriction enzyme to form a restricted
linear nucleic acid molecule containing the change and
introducing nucleic acid generated therefrom into compatible
host cells. The variable base codons preferably do not alter
the corresponding amino acid sequence of the polypeptide.
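The idea of varying only the third (wobble) position of a codon
without changing the encoded amino acid can be sketched as follows.
The code uses the standard genetic code; the function names are
hypothetical and introduced here for illustration, and only
third-position changes are considered (synonymous codons that
differ at the first or second position are ignored).

```python
# Standard genetic code, bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def silent_wobble_bases(codon: str) -> str:
    """Third-position bases that leave the encoded amino acid unchanged."""
    codon = codon.upper()
    amino_acid = CODON_TABLE[codon]
    return "".join(
        sorted(b for b in BASES if CODON_TABLE[codon[:2] + b] == amino_acid)
    )


def silent_library_size(coding_seq: str) -> int:
    """Number of distinct silent wobble variants of an in-frame sequence."""
    size = 1
    usable = len(coding_seq) - len(coding_seq) % 3  # ignore a partial codon
    for i in range(0, usable, 3):
        size *= len(silent_wobble_bases(coding_seq[i:i + 3]))
    return size
```

For instance, the alanine codon GCT tolerates any base in its third
position, whereas the methionine codon ATG tolerates none, so the
diversity of such a library depends strongly on which codons are
varied.
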

In a preferred embodiment the primers direct alterations
in the leader sequence of the polypeptide. The leader
sequence is preferably the bacterial OmpA protein leader
sequence or a fragment thereof and the leader sequence is
preferably linked to polynucleotide encoding light and heavy
chain antibody fragments.

DETAILED DESCRIPTION OF THE INVENTION

The invention provides a novel method for rapid and
efficient site-directed mutagenesis of double-stranded linear
or circular DNA. The method, termed Enzymatic Inverse
Polymerase Chain Reaction (EIPCR), greatly improves the
utility of previous PCR techniques, enabling rapid screening or
selection of putative mutants to identify clones containing
changes of interest.

In one embodiment, oligonucleotide primers containing the
desired sequence changes are used to direct PCR synthesis of
a double-stranded circular DNA template (Figure 1). The
primers are designed so that they additionally contain a class
IIS restriction enzyme recognition sequence and a sequence
complementary to the template for primer hybridization. The
primers are hybridized to opposite strands of the circular
template and direct the amplification of each strand to form
linear molecules containing the desired mutations. The ends
of the linear molecules are filled in with Klenow polymerase
or T4 DNA polymerase and restricted with the appropriate class
IIS restriction enzyme to produce compatible overhangs for
circularization and ligation.

EIPCR uses class IIS restriction enzyme recognition
sequences in the mutated or non-mutated PCR primers. This
type of recognition sequence is used because the cleavage site
is separated from the recognition sequence and therefore does
not introduce extraneous sequences into the final product.
Restriction of the PCR products with a class IIS enzyme
removes the recognition sequence and produces homogeneous
termini for subsequent ligation. Class IIS recognition
sequences therefore circumvent problems associated with
ligating heterogeneous PCR termini since such termini will be
cleaved off using a class IIS restriction enzyme. If the
primers are designed with complementary cleavage sites, the
resulting termini will have complementary overhangs which can
be used for circularization of the linear molecules. Such
complementary overhangs increase the efficiency of
intramolecular ligation compared to blunt ends and result in
a high percentage of correctly mutated clones. Thus, EIPCR
allows efficient mutagenesis and production of homogeneous
termini of any DNA template without incorporating extraneous
sequences. EIPCR also allows mutagenesis at any location
within a circular template independent of convenient
restriction sequences.
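The way a class IIS digest trims off both the recognition sequence
and any ragged PCR termini can be sketched on the top strand of a
linear product. The sketch assumes Bsa I, with the commonly
published recognition sequence GGTCTC and cuts 1 and 5 nucleotides
downstream on the two strands; the function name and the toy
sequence in the usage note are hypothetical. The left-hand site is
read on the top strand, while the right-hand site lies on the bottom
strand and therefore appears on the top strand as its reverse
complement.

```python
SITE = "GGTCTC"             # Bsa I recognition sequence (assumed here)
TOP_CUT, BOTTOM_CUT = 1, 5  # cut offsets 3' of the site on each strand

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def digest_and_circularize(linear_top: str) -> str:
    """Return the top strand of the circle formed after digestion.

    Raises ValueError if a site is missing or the two overhangs
    cannot pair with each other.
    """
    left = linear_top.find(SITE)
    right = linear_top.rfind(revcomp(SITE))
    if left == -1 or right == -1:
        raise ValueError("both ends must carry the recognition sequence")
    overhang_len = BOTTOM_CUT - TOP_CUT
    start = left + len(SITE) + TOP_CUT  # top-strand cut at the left end
    end = right - BOTTOM_CUT            # top-strand cut at the right end
    if end < start + overhang_len:
        raise ValueError("recognition sites are too close together")
    # In top-strand coordinates, pairing overhangs read identically.
    if linear_top[start:start + overhang_len] != linear_top[end:end + overhang_len]:
        raise ValueError("overhangs are not complementary")
    return linear_top[start:end]
```

With the toy product AAAAGGTCTCACATGTTTGGGCCCCATGTGAGACCTTTT, the
function returns CATGTTTGGGCCC: the four-base overhang appears once
in the circle, and neither recognition sequence nor the terminal
bases outside it survive.
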

As used herein, the term "predetermined change" refers to
a specific desired change within a known nucleic acid
sequence. Such desired changes are commonly referred to in
the art as site-directed mutagenesis and include, for example,
additions, substitutions and deletions of base pairs. A
specific example of a base pair change is the conversion of
the first A/T bp in the sequence AGCA to a G/C bp to yield the
sequence GGCA. It is understood that when referring to a base
pair, only one strand of a double-stranded sequence or one
nucleotide of a base pair need be used to designate the
referenced base pair change since one skilled in the art will
know the corresponding complementary sequence or nucleotide.

As used herein, the term "class IIS restriction enzyme
recognition sequence" refers to the recognition sequence of
class IIS restriction enzymes. Class IIS enzymes cleave
double-stranded DNA at precise distances from their
recognition sequence. The recognition sequence is generally
about four to six nucleotides in length and directs cleavage
of the DNA downstream from the recognition sequence. The
distance between the recognition sequence and the cleavage
site as well as the resulting termini generated in the
restricted product vary depending on the particular enzyme
used. For example, the cleavage site can be anywhere from one
to many nucleotides downstream from the 3'-most nucleotide of
the recognition sequence and can result in either blunt cuts
or 5' and 3' staggered cuts of variable length. Such
staggered cuts produce termini having single-stranded
overhangs. Therefore, "complementary cleavage sites" as used
herein refers to complementary nucleic acid sequences at such
single-stranded overhangs. Class IIS restriction enzyme
recognition sequences suitable for use in the invention can
be, for example, Alw I, Bsa I, Bbs I, Bbv I, Bsm AI, Bsr I,
Bsm I, BspM I, Ear I, Esp 3I, Fok I, Hga I, Hph I, Mbo II, Ple
I, SfaN I, and Mnl I. It is understood that the recognition
sequence of any enzyme that utilizes this separation between
the recognition sequence and the cleavage site is included
within this definition.
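The relationship between the two cut positions and the resulting
terminus can be illustrated with a small lookup table. The
recognition sequences and cut offsets below are taken from commonly
published enzyme data for a handful of the enzymes named above and
are an assumption of this sketch rather than part of this
description; they should be checked against Figure 3 or a
supplier's data sheet before use. The type and function names are
hypothetical.

```python
from typing import NamedTuple, Tuple


class ClassIISEnzyme(NamedTuple):
    site: str        # recognition sequence, 5'->3'
    top_cut: int     # nucleotides 3' of the site, strand carrying the site
    bottom_cut: int  # same distance measured on the complementary strand


# Assumed values from commonly published enzyme data (verify before use).
ENZYMES = {
    "Bsa I": ClassIISEnzyme("GGTCTC", 1, 5),
    "Bsm AI": ClassIISEnzyme("GTCTC", 1, 5),
    "Esp 3I": ClassIISEnzyme("CGTCTC", 1, 5),
    "Bbs I": ClassIISEnzyme("GAAGAC", 2, 6),
    "Fok I": ClassIISEnzyme("GGATG", 9, 13),
    "Ear I": ClassIISEnzyme("CTCTTC", 1, 4),
    "Mbo II": ClassIISEnzyme("GAAGA", 8, 7),
}


def overhang(name: str) -> Tuple[int, str]:
    """Return (length, kind) of the terminus an enzyme leaves."""
    enzyme = ENZYMES[name]
    diff = enzyme.bottom_cut - enzyme.top_cut
    if diff > 0:
        return diff, "5' overhang"
    if diff < 0:
        return -diff, "3' overhang"
    return 0, "blunt"
```

Under these assumed values, Bsa I leaves a four-nucleotide 5'
overhang, while Mbo II leaves a single-nucleotide 3' overhang, which
illustrates why the choice of enzyme determines how the primer
overhang regions must be designed.
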

As used herein, the term "substantially complementary"
refers to a nucleotide sequence capable of specifically
hybridizing to a complementary sequence under conditions known
to one skilled in the art. For example, specific
hybridization of short complementary sequences will occur
rapidly under stringent conditions if there are no mismatches
between the two sequences. If mismatches exist, specific
hybridization can still occur if a lower stringency is used.
Specificity of hybridization is also dependent on sequence
length. For example, a longer sequence can have a greater
number of mismatches with its complement than a shorter
sequence without losing hybridization specificity. Such
parameters are well known and one skilled in the art will
know, or can determine, what sequences are substantially
complementary to allow specific hybridization.
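As a simple aid when judging whether a primer is substantially
complementary to its target, the mismatches between the two can be
listed directly. The sketch below is only a counting tool with
hypothetical function names; it does not predict hybridization,
which also depends on length, base composition and stringency as
described above.

```python
from typing import List

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def mismatch_positions(primer: str, target: str) -> List[int]:
    """Positions (0-based, along the primer) that do not pair with the target.

    Both sequences are written 5'->3'; `target` is the stretch of the
    strand the primer is meant to anneal to and must have the same
    length as the primer.
    """
    if len(primer) != len(target):
        raise ValueError("primer and target must have the same length")
    perfect = revcomp(target)
    return [i for i, (p, q) in enumerate(zip(primer.upper(), perfect)) if p != q]


def mismatch_fraction(primer: str, target: str) -> float:
    """Fraction of primer positions that are mismatched."""
    return len(mismatch_positions(primer, target)) / len(primer)
```

A perfectly matched primer returns an empty list; a longer primer
can carry more entries in that list at the same mismatch fraction,
which mirrors the length dependence noted in the text.
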

As used herein, the term "a primer capable of directing"
when used in reference to nucleic acid sequence changes refers
to a primer having a mismatched base pair or base pairs within
its sequence compared to the template sequence. Such
mismatches correspond to the mutant sequences to be
incorporated into the template and can include, for example,
additional base pairs, deleted base pairs or substitute base
pairs. It is understood that either one or both primers used
for the PCR synthesis can have such mismatches so long as
together they incorporate the desired mutations into the
wild-type sequence.

Thus, the invention provides methods of introducing at
least one predetermined change in a nucleic acid sequence of
a double-stranded DNA. Such methods include: (a) providing
a first primer and a second primer capable of directing said
predetermined change in said nucleic acid sequence, said first
and second primers comprising a nucleic acid sequence
substantially complementary to said double-stranded DNA so as
to allow hybridization, a class IIS restriction enzyme
recognition sequence and cleavage sites; (b) hybridizing said
first and second primers to opposite strands of said
double-stranded DNA to form a first pair of primer-templates
oriented in opposite directions; (c) extending said first pair
of primer-templates to create double-stranded molecules; (d)
hybridizing said first and second primers at least once to
said double-stranded molecules to form a second pair of
primer-templates; (e) extending said second pair of
primer-templates to produce double-stranded linear molecules
terminating with class IIS restriction enzyme recognition
sequences; and (f) restricting said double-stranded linear
molecules with a class IIS restriction enzyme to form
restricted linear molecules containing said change in said
nucleic acid sequence.
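The outcome of steps (b) through (e) on a circular template can be
sketched at the sequence level. This is a deliberately simplified
model under stated assumptions: the two primers anneal back to back
at a single junction on the circle, their 3' portions match the
template perfectly, and everything new (recognition sequence,
spacer, changed bases) sits in their 5' tails. The function and
parameter names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def inverse_pcr_product(plasmid_top: str, junction: int,
                        fwd_tail: str, rev_tail: str) -> str:
    """Top strand of the linear product of an inverse PCR.

    plasmid_top : top strand of the circular template, 5'->3'
    junction    : index where the forward primer's annealing region
                  starts; the reverse primer anneals immediately
                  upstream, pointing the other way
    fwd_tail    : 5' tail of the forward primer (top-strand sense)
    rev_tail    : 5' tail of the reverse primer (bottom-strand sense)
    """
    n = len(plasmid_top)
    if n == 0:
        raise ValueError("template must not be empty")
    j = junction % n
    # Opening the circle at the junction gives the full-length body.
    body = plasmid_top[j:] + plasmid_top[:j]
    # The reverse primer's tail is copied onto the 3' end of the top strand.
    return fwd_tail.upper() + body.upper() + revcomp(rev_tail)
```

The product is the whole template opened at the junction, flanked by
the primer tails, which is why the linear molecule of step (e)
terminates with the recognition sequences carried by the primers.
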

Enzymatic Inverse Polymerase Chain Reaction (EIPCR) is a
PCR-based method for performing site-directed mutagenesis.
Mutations are introduced into a DNA by first hybridizing
primers which contain the desired mutations to the DNA,
referred to herein as mutant primers. The resulting
primer-templates are enzymatically extended with a polymerase
to yield an intermediate product. Repriming of the
intermediates and polymerase extension will yield the final
mutant product. Cohesive termini can be subsequently generated
for circularization of the linear products by intramolecular
ligation.

The invention is described with particular reference to
introducing a predetermined change into a circular template
and recircularizing the product to generate mutant copies
of the starting template. However, one skilled in the art can
use the teachings and methods described herein to similarly
generate mutations in linear templates. The primers designed
for use on linear templates are similar to those used for
circular templates. Appropriate modifications of primers for
use on linear templates are known to one skilled in the art
and will be determined by the intended use of the final mutant
product. For example, when generating circular products,
either from a linear or circular starting template, it is
beneficial to use primers containing complementary cleavage
sites downstream from the class IIS recognition sequence.
Such complementary sites greatly increase the efficiency of
intramolecular ligation. With linear molecules, on the other
hand, while it is beneficial in some cases for the primers to
contain class IIS recognition sequences which produce
single-stranded overhangs at their cleavage sites, such
cleavage sites need not be complementary. For example, if the
product is a linear molecule for subcloning into a vector,
cleavage sites which are not complementary can be used for
directional cloning of the product. Additionally, a blunt
cleavage site can be used to eliminate sequence requirements
for subcloning. Thus, depending on the desired product, the
cleavage sites within the primers can be complementary or
non-complementary.

EIPCR primers are synthesized having three basic sequence
components. These sequences are used for generating mutations
and for enabling efficient formation of circular products
without introducing unwanted sequences or requiring the use of
template restriction sequences. The first sequence component
of the primers is the region which directs the predetermined
changes. This region contains the desired mutations which are
to be introduced into the template. The length and sequence
of this region will depend on the number and locations of
incorporated mutations. For example, if multiple and adjacent
mutations are desired, then the primer will not contain any
nucleotides within this region identical to the wild-type
sequence. However, if the mutations are not located at
adjacent positions, then the nucleotides in between such
mutations will be identical to the wild-type sequence and
capable of hybridizing to the appropriate complementary
strand. Thus, the region can be from one to many nucleotides
in length so long as it contains the desired mismatches with
the wild-type sequence.

It is only necessary for one of the primers to contain
the desired mutations, but a larger number of bases can be
mutagenized and a higher efficiency of correct mutations can
be obtained if both primers contain the desired mutations on
each complementary strand. A strategy for designing EIPCR
primers is outlined in Figure 2. This strategy shows an
example of a pair of primers which can be used for mutagenesis
at two nonadjacent locations. One skilled in the art can use
this strategy and the teachings described herein to design and
use primers that incorporate essentially any desired mutation
into a double-stranded DNA. The template containing the
wild-type sequence is shown in Figure 2A (SEQ ID NO: 1). Also
shown are the desired nucleotide substitutions (arrows). The
actual primers are depicted in Figure 2B as the shaded
sequence (SEQ ID NO: 2; SEQ ID NO: 3). The region of each
primer containing the desired substitutions is complementary
and corresponds to the opposite strand at the same location
within the template (Figure 2C) (SEQ ID NO: 4). For primers
A (SEQ ID NO: 2) and B (SEQ ID NO: 3) in Figure 2B, the mutant
region would consist of the sequence GTTCC and its complement,
respectively.

The second sequence component of EIPCR primers is the
region containing the class IIS restriction enzyme recognition
sequence. The location of the recognition sequence is 5' to
the mutant region and thus is incorporated at the termini of
any extension products. Since recognition sequences are
located at the ends of linear extension products, they can
also contain additional 5' sequences to facilitate recognition
and cleavage by a class IIS enzyme. For example, the primers
in Figure 2B (SEQ ID NO: 2; SEQ ID NO: 3) contain four
additional nucleotides 5' to the Bsa I recognition sequence
(SEQ ID NO: 5).

Other sequences included within the recognition sequence
component of EIPCR primers are the nucleotides between the
recognition sequence and the cleavage site. The number of
nucleotides will correspond to the distance between these two
sites and therefore will vary for different enzymes. For
example, the primers of Figure 2 contain a Bsa I recognition
sequence (SEQ ID NO: 5) which is cleaved by Bsa I on opposite
strands one and five nucleotides, respectively, 3' to the
recognition sequence, leaving a four nucleotide single-strand
overhang. Generally, such overhang sequences within the
primers are completely complementary to each other but can
include limited mutations. Primers are synthesized with
filler nucleotides placed 5' to the first cleavage site. The
number of filler nucleotides corresponds to the distance
between the particular class IIS recognition sequence used and
its cleavage site. The sequence of such spacer nucleotides
can, for example, correspond to wild-type or non-wild-type
sequences or to predetermined mutations. For generating just
a few point mutations, it is beneficial to match these
nucleotides to the wild-type sequence to increase the
hybridization stability of the adjacent mutant primer region.

Types of restriction enzyme recognition sequences to be
used in the invention are those recognized by class IIS
enzymes. These enzymes recognize the DNA through a
sequence-specific interaction and cleave it at a discrete
distance downstream from the recognition sequence. The ability
to cleave such sequences downstream provides a useful means to
remove heterogeneous ends and to produce complementary termini
for circularization while at the same time removing the
recognition sequence from the final product. Specific
examples of class IIS recognition sequences have been listed
previously and are also listed in Figure 3 along with their
nucleotide sequences and cleavage sites (SEQ ID NOS: 5 through
20). Although recognition sequences having complementary
cleavage sites associated with them are preferred, those which
have blunt ended cleavage sites can also be used in the
invention.

The third sequence component of EIPCR primers is the
region to be hybridized to the template DNA. This region must
be sufficient in length and sequence to allow specific
hybridization to the template. The hybridized portion of the
primers must also form a stable primer-template which can be
used as a substrate for polymerase extension. It is typically
found 3' to the mutant primer region and its sequence is
determined with respect to the location of the desired
mutations. For example, for the primers shown in Figure 2
(SEQ ID NO: 2; SEQ ID NO: 3), the hybridization region is
twenty nucleotides in length and found 3' to the mutant
region. However, the hybridization region can also be 5' to
the mutant region. For this orientation, the mutant region
must form a stable primer-template which can be used as a
substrate for polymerase extension. Longer or shorter
hybridization sequences can be used in this region so long as
they are appropriately located with respect to the mutant
region and also specifically hybridize to the template
molecule. One skilled in the art knows or can readily
determine the specificity of such hybridization regions for
use in EIPCR primers.

Thus, the invention also provides a synthetic primer for
introducing at least one predetermined change in a nucleic
acid sequence of a double-stranded circular DNA. The primer
includes: (a) a class IIS restriction enzyme recognition
sequence; (b) said predetermined change in said nucleic acid
sequence; and (c) a nucleic acid sequence substantially
complementary to said double-stranded DNA. The preferred
orientation of the above regions (a) through (c) is in a 5' to
3' direction.
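
By way of illustration only, the 5' to 3' arrangement of
regions (a) through (c) can be sketched in a few lines of
Python. The sketch assumes Bsa I (GGTCTC, with one spacer
nucleotide before the first cleavage site, as described
above). The function name and the filler, spacer and
hybridization sequences are hypothetical placeholders and are
not the sequences of Figure 2; only the mutant region GTTCC is
taken from the text.

```python
BSA_I_SITE = "GGTCTC"   # Bsa I recognition sequence
BSA_I_SPACER_LEN = 1    # nucleotides between the site and the first cut


def build_eipcr_primer(filler_5p, spacer, mutant_region, hybridization_region):
    """Join the primer components in the preferred 5' to 3' order."""
    if len(spacer) != BSA_I_SPACER_LEN:
        raise ValueError("spacer must match the site-to-cut distance")
    return filler_5p + BSA_I_SITE + spacer + mutant_region + hybridization_region


primer = build_eipcr_primer(
    filler_5p="ATAT",        # hypothetical: four extra 5' nucleotides
    spacer="A",              # hypothetical filler nucleotide
    mutant_region="GTTCC",   # mutant region given for primer A
    hybridization_region="ACGTTGCAAGCTTGGCACTG",  # hypothetical 20-nt match
)
print(primer)
print(len(primer))
```

In this arrangement the four nucleotides immediately 3' to the
spacer would form the single-strand overhang after Bsa I
digestion.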

The above described primers can be, for example,
hybridized to a double-stranded circular or linear DNA
molecule which has first been denatured. Denaturation can be
performed, for example, using heat or an alkaline solution.
Other methods known to one skilled in the art can also be
used.

Hybridization of the primers occurs on opposite strands
of the circular template and in a location where the
single-stranded overhangs of each primer's complementary
cleavage site can be joined together by restriction and
ligation. Preferably, such joining should occur so that the
wild-type sequence is reformed except for the incorporation of
the desired mutations. One way to ensure proper sequence
reconstruction is to design the primers such that their
complementary cleavage sites overlap and are either identical
to the template sequence or contain some or all of the desired
mutations. Such primers, once hybridized to a double-stranded
circular DNA, form primer-templates and can be extended with
a polymerase. The first extension reactions of circular
templates result in the synthesis of double-stranded circular
products which can be concatenated. Depending on the extent
of polymerization, the concatemers can be either partially or
completely double-stranded. It is necessary for
polymerization to proceed sufficiently far to allow subsequent
primer hybridization for a second extension reaction. Smaller
circular DNAs result in a greater number of completely
double-stranded products and also require shorter extension
times compared to much larger circles. Small circular DNAs of
less than 1.0 kb are known in the art. Such vectors are
beneficial to use in the invention since they can accommodate
large inserts (3 to 5 kb) and still be comparable in size to
most standard cloning vectors. The plasmid pVX is a specific
example of a 902 bp vector, Seed, B., Nuc. Acids Res.
11:2427-2445 (1983), which is incorporated herein by
reference. Such vectors can be further modified by the
addition of, for example, promoters, terminators and the like
to achieve the desired end. Complete extension of a circular
DNA of about 5.0 kb can be achieved using the conditions
described herein; however, alternative conditions used by
those skilled in the art to achieve complete extension of
larger circular DNAs can also be used to practice the
invention. For linear templates, on the other hand, the first
extension reaction produces a double-stranded linear molecule
known in the art as the long product.

After one extension reaction, the double-stranded
products, whether they exist as circular or linear molecules,
have incorporated at one of their ends the EIPCR primer with
its associated class IIS restriction enzyme recognition
sequence and the desired mutations. These double-stranded
molecules can be used for a second cycle of hybridization and
extension to produce double-stranded linear molecules which
terminate at both ends with EIPCR primers. Further cycles
will result in the exponential amplification of template
sequence located between each primer on the circular DNA.
Thus, the location of the hybridized primers defines the
termini of template sequences to be amplified.

Polymerases which can be used for the extension reaction
include all of the known DNA polymerases. However, if
multiple cycles of hybridization and extension are to be
performed, such as required for PCR amplification, then
preferably a thermostable polymerase is used. Thermostable
polymerases include, for example, Taq polymerase, Vent
polymerase and PFU polymerase. Vent and PFU polymerase
advantageously exhibit a higher fidelity than Taq due to their
3' to 5' proofreading capability.

Following synthesis of the linear molecules, the products
are restricted with the appropriate class IIS restriction
enzyme to remove the class IIS recognition sequence and
heterogeneous termini and to create cohesive termini used for
circularization. The resulting termini correspond to the
single-strand overhangs produced after restriction of each
primer's complementary cleavage site. To facilitate proper
recognition and cleavage, the linear products can be
pre-treated with a polymerase, such as Klenow, under
conditions which create blunt ends. This procedure will fill
in any uncompleted product ends produced during amplification
and allows efficient restriction of essentially all of the
products. After restriction, the cohesive termini can be
joined to recircularize the linear molecule. Covalently
closed circles can subsequently be formed in vitro with a
ligase. Alternatively, in vivo ligation can be accomplished
by introducing the circularized products into a compatible
host by transformation or electroporation, for example.

Transformation or electroporation of the circularized
products can additionally be used for the propagation and
manipulation of mutant products. Such techniques and their
uses are known to one skilled in the art and are described,
for example, in Sambrook et al., Molecular Cloning: A
Laboratory Manual, Cold Spring Harbor, Cold Spring Harbor, NY
(1989), or in Ausubel et al., Current Protocols in Molecular
Biology, John Wiley and Sons, New York, NY (1989), both of
which are incorporated herein by reference. Propagation and
manipulation procedures do not have to be performed at the end
of all EIPCR reactions. The need will determine whether such
procedures are necessary. For example, transformation and DNA
preparation can be eliminated if two consecutive EIPCR
reactions are to be performed where the product of the first
reaction is used as the template for the second reaction. All
that is necessary is that the first reaction products are
circularized and ligated prior to hybridization with the
second reaction primers. Additionally, primers for EIPCR can
be used without purification. EIPCR is not as sensitive as
other methods to the presence of primers of incomplete length
because the non-uniform DNA ends are removed by restriction of
the class IIS recognition sequence.

The invention further provides methods of producing at
least two changes located at one or more positions within a
nucleic acid sequence of a double-stranded circular DNA. The
methods include: (a) providing a first population of primers
and a second population of primers capable of directing said
changes in said nucleic acid sequence, said first and second
populations of primers comprising a nucleic acid sequence
substantially complementary to said double-stranded DNA so as
to allow hybridization, a class IIS restriction enzyme
recognition sequence, and cleavage sites; (b) hybridizing said
first and second populations of primers to opposite strands of
said double-stranded DNA to form a first pair of
primer-template populations orientated in opposite directions;
(c) extending said first pair of primer-template populations
to create a population of double-stranded molecules; (d)
hybridizing said first and second populations of primers at
least once to said population of double-stranded molecules to
form a second pair of primer-template populations; (e)
extending said second pair of primer-template populations to
produce a population of double-stranded linear molecules
terminating with class IIS restriction enzyme recognition
sequences; and (f) restricting said population of
double-stranded linear molecules with a class IIS restriction
enzyme to form a population of restricted linear molecules
containing said changes within said nucleic acid sequence.
Also provided is a population of synthetic primers for
producing at least two changes located at one or more
positions within a nucleic acid sequence of a double-stranded
circular DNA comprising: (a) a class IIS restriction enzyme
recognition sequence; (b) said changes within said nucleic
acid sequence; and (c) a nucleic acid sequence substantially
complementary to said double-stranded circular DNA.

The method for producing at least two changes located at
one or more positions is similar to that described above for
site-directed mutagenesis except that the primers can have
more than one nucleotide at a desired position. For example,
if it is desirable to produce mutations incorporating from two
to four different mutant nucleotides at a particular position,
then a population of primers should be synthesized such that
all mutant nucleotides are represented within the entire
population. Each individual primer within the population will
contain only a single mutant nucleotide. The proportion of
primers containing identical mutant nucleotides will determine
the expected frequency of that mutation being correctly
incorporated into the final product. For example, if only two
mutant nucleotides are desired and each one is equally
represented within the primer population, then 50% of the
products should contain one of the mutations and 50% should
contain the other mutation. If more than two mutations are
desired at a particular position or at more than one position,
then primer populations should be synthesized which contain
individual primers having each of the desired mutations.
Primer populations can also be synthesized which direct single
mutations at one position and multiple mutations at another
position by incorporating one or more mutant nucleotides at
the appropriate position.
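
The relationship between the composition of a primer
population and the expected frequency of each mutation can be
sketched as follows. This is a minimal illustration that
assumes the nucleotides at different positions are
incorporated independently; the function name and the
three-base example sequence are hypothetical.

```python
from itertools import product


def primer_population(positions):
    """Enumerate primers and their expected frequencies.

    positions: list of dicts mapping a nucleotide to its proportion
    at that position (a fixed position has one entry of 1.0).
    """
    population = {}
    for combo in product(*(p.items() for p in positions)):
        seq = "".join(base for base, _ in combo)
        freq = 1.0
        for _, proportion in combo:
            freq *= proportion
        population[seq] = population.get(seq, 0.0) + freq
    return population


# Hypothetical primer region: fixed G, a mixed position, fixed C.
positions = [
    {"G": 1.0},
    {"A": 0.5, "T": 0.5},   # two mutant nucleotides, equally represented
    {"C": 1.0},
]
pop = primer_population(positions)
for seq, freq in sorted(pop.items()):
    print(seq, freq)
```

With two equally represented mutant nucleotides at one
position, each of the two primers makes up half of the
population, in line with the 50%/50% expectation stated above.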

The design and use of such primers is identical to that
previously described for introducing at least one
predetermined change into a double-stranded circular DNA. The
only difference is that instead of hybridizing a first primer
and a second primer to form a pair of primer-templates,
hybridization is with a first population of primers and a
second population of primers to form a pair of primer-template
populations. Each primer-template within the population can
include, for example, one of the desired mutant sequences to
be incorporated into the resultant products. Amplification of
the primer-template population will produce a population of
linear products containing all desired mutations. The
products can be restricted, circularized and screened for
individual mutant clones. Screening can be performed, for
example, by sequencing or by expression of polypeptide.
Selection can be performed by linking polypeptide expression
with the expression of a suitable marker such as an antibiotic
resistance gene, luciferase, or the like. Only colonies
containing the gene are selected. Following selection,
positive colonies can then be screened for a particular
characteristic. Expression screening or selection offers the
advantage of screening or selecting a large number of clones
in a relatively short period of time. These assays permit the
identification of clones of interest. Examples of screening
and selection assays are well known to those with skill in the
art. Each assay is designed and modified for that particular
application. Examples of these assays are found in the
examples below.

The methods and primers described herein can be used to
create essentially any desired change in a nucleic acid
sequence. Templates can be linear or circular and result in
products containing only the desired changes since class IIS
recognition sequences allow the removal of extraneous and
unwanted sequences. Product termini which are homogeneous in
nature are also produced using the class IIS recognition
sequences. Use of circular templates allows the incorporation
of mutations at any desired location along the template with
subsequent recircularization of the mutant products. Thus,
additions, deletions and substitutions of single base pairs,
multiple base pairs, gene segments and whole genes can rapidly
and efficiently be produced using EIPCR. A specific use of
EIPCR would be in the mutagenesis of antibodies or antibody
domains. Mutagenesis of antibody complementarity determining
regions (CDR), for example, can be performed using EIPCR for
the rapid generation of antibodies exhibiting altered binding
specificities. Likewise, EIPCR can also be used for producing
chimeric and/or humanized antibodies having desired
immunogenic properties.

The efficiency of incorporating correct mutations into
the product using EIPCR can be, for example, greater than
about 90%, preferably about 95 to 99%, more preferably about
100%. This efficiency is routinely obtained when using about
0.5 to 2.0 ng of template in a 25 cycle PCR reaction.
However, it should be understood that the efficiency directly
correlates with the number of amplification cycles and
inversely with the amount of template used. For example, the
more amplification cycles which are performed, the greater the
amount of mutant product present and therefore a larger
fraction of mutant sequences will be present within the total
sequence population. Conversely, if a large amount of
template is used, more amplification cycles are required,
compared to using a smaller amount of template, to achieve the
same fraction of mutant sequences within the total sequence
population. One skilled in the art knows such parameters and
can adjust the number of cycles and amount of template
required to achieve the required efficiency.

The following examples are intended to illustrate but not 
limit the invention. 


EXAMPLE I

This example shows the use of EIPCR for site-directed
mutagenesis of two bases located on a 2.6 kb pUC-based plasmid
(designated p186).

The design of the primers and their relationship to the
template and to the final mutant sequence is shown in Figure
2. The 3' end of the primer is an exact match of 20 bases.
The 5' ends of the primers comprise the enzyme recognition
site and the enzyme cut site, which was designed to form
complementary overhangs. Four additional bases were added 5'
to the enzyme recognition sequence to facilitate recognition
and digestion of the PCR product by the enzyme. Two
complementary mutations were designed into each of the
primers. Bsa I was the enzyme used to make the overhangs
(Figure 3).

PCR reactions were performed in 100 µl volumes containing
0.2-1.0 µM of each unpurified primer, 0.5 ng uncut p186
template plasmid DNA, 1X Vent buffer, 200 µM of each dNTP, 2.5
units Vent polymerase (New England Biolabs, Beverly, MA).
Thermal cycling was performed on a Perkin-Elmer-Cetus PCR
machine (Emeryville, CA) with the following parameters:
94°C/3 minutes for 1 cycle; 94°C/1 minute, 50°C/1 minute,
72°C/3-4 minutes for 3 cycles; 94°C/1 minute, 55°C/1 minute,
72°C/3-4 minutes, with autoextension at 4-6 sec/cycle for 25
cycles; followed by one 10 minute cycle at 72°C.

To blunt the ends of the PCR product, the entire reaction
mix was supplemented with 3 µl of 10 mM of dNTP mixture (2.5
mM each) and 20 units of Klenow fragment (Gibco-BRL,
Gaithersburg, MD) and incubated at 37°C for 30 minutes. The
reaction was then extracted with an equal volume of
phenol/chloroform (1:1), ethanol-precipitated, and the pellet
was washed and dried. The blunt end product was then
restriction digested with Bsa I (New England Biolabs, Beverly,
MA) as recommended by the manufacturer. The digested DNA was
extracted with an equal volume of phenol/chloroform,
ethanol-precipitated, as described above, and ligated with 20
units T4 DNA ligase (Gibco-BRL) for one hour at room
temperature. Gel-purification of the digested DNA before
ligation was not necessary. After ligation, the DNA was
transformed into competent DH10B cells as recommended by the
manufacturer (Gibco-BRL).

Approximately 400 colonies were obtained from a
transformation using 10 ng of DNA into 30 µl of frozen
competent cells. The transformation efficiency was 4x10^4
cfu/µg of DNA. Seven colonies were randomly picked and
plasmid DNA was prepared for restriction digests. No
differences in restriction pattern were seen. The mutated
areas of the plasmids of these seven colonies were sequenced.
Double-stranded dideoxy sequencing was performed on a Dupont
Genesis 2000 automated sequencer using the Dupont Genesis 2000
sequencing kit. The sequences of all seven plasmids contained
the desired mutation.
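
The transformation efficiency follows directly from the colony
count and the amount of DNA used. The short Python sketch
below reproduces that arithmetic from the figures stated above
(about 400 colonies from 10 ng of DNA); the function name is
illustrative only.

```python
def transformation_efficiency(colonies, dna_ng):
    """Return colony-forming units per microgram of DNA."""
    # 1 microgram = 1000 nanograms
    return colonies * 1000.0 / dna_ng


efficiency = transformation_efficiency(colonies=400, dna_ng=10)
print(f"{efficiency:.1e} cfu/ug")
```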

EXAMPLE II 

This example shows the use of EIPCR for constructing 
large libraries of protein mutants. 

The binding site of an antibody, called the Fv fragment, 
normally consists of a heavy chain and a light chain, each 
about 110 amino acids long. Using molecular modelling tools, 
several groups have constructed single chain Fv fragments 
(scFv) in which the c-terminus of one chain is connected by a 
10-15 amino acid linker to the n-terminus of the other chain 
(Huston, Bird, Glockshuber) . The single chain construct was 
shown to be much more stable than the two chain Fv. 

To eliminate the need for molecular modelling, EIPCR was
used to make a large library of different linkers and screen
for a scFv clone that is not only active but also expressed at
a high level. An antibody was chosen that binds a radioactive
Indium chelate, Reardan et al., Nature 316:265-268 (1985),
which is incorporated herein by reference. A 3.5 kb
pUC-derived plasmid was constructed in which both Fv chains
are attached to ompA leader peptides and driven by a Lac
promoter (Figure 4). This plasmid was used as the template
for EIPCR in which the DNA between the c-terminus of the first
chain and the n-terminus of the mature second chain was
replaced by a random mixture of bases, encoding a library of
random linkers. The design of the primers is shown in Figure
4B in the shaded region where N represents an equal proportion
of all four nucleotides at the position within the primer
population.
Synthesis of the two primer populations used to construct
the library was performed on a Milligen/Biosearch 8700 DNA
synthesizer. The mixed base positions were synthesized using
a 1:1:1:1 mixture of each of the four bases in the U
reservoir. The oligonucleotides were made trityl-on and were
purified with Nensorb Prep nucleic acid purification columns
(NEN-Dupont, Boston, MA) as described by the manufacturer.

PCR reactions were performed in 100 µl volumes containing
0.5 µM of each unpurified primer, 0.5 ng pUCHAFvl template
plasmid DNA, 1x Taq buffer, 200 µM of each dNTP, 1 µl Taq
polymerase (Perkin-Elmer-Cetus). Thermal cycling was
performed on a Perkin-Elmer-Cetus PCR machine with the
following parameters: 94°C/3 minutes for 1 cycle; 94°C/1
minute, 50°C/1 minute, 72°C/2 minutes for 3 cycles; 94°C/1
minute, 55°C/1 minute, 72°C/2 minutes, with autoextension at
4 sec/cycle for 25 cycles; followed by one 10 minute cycle at
72°C.

The product of the 100 µl PCR was extracted with an equal
volume of phenol/chloroform (1:1), ethanol-precipitated, and
the pellet was resuspended in 20 µl KKL buffer (50 mM Tris-HCl
pH 7.6, 10 mM MgCl2, 5 mM DTT; suitable for Klenow, Kinase and
Ligase) containing 200 µM dNTPs, 1 mM ATP, 10 units DNA
Polymerase Klenow fragment and 10 units T4 DNA Kinase and
incubated at 37°C for 30 minutes. Then 10 units T4 DNA ligase
were added, and the reaction was continued for 2 hours at room
temperature. The enzymes were then inactivated by heating at
65°C for 10 minutes. The polymerized DNA was then digested
with Bbs I (NEB) which cuts off the ends of the PCR fragment,
inside the oligos. It was found that Bbs I digestion was
inefficient with only four bp 5' to the recognition sequence.
To create a longer 5' extension and improve efficiency, the
DNA was ligated before digestion. Alternatively, primers
could have been synthesized with longer 5' extensions. The
digested DNA was then extracted with phenol/chloroform,
ethanol precipitated, and resuspended in 20 µl 1x NEB ligation
buffer, containing 1 mM ATP and 10 units T4 DNA ligase and the
reaction was incubated for 2 hours at room temperature.

One microliter amounts of the ligation reaction were
electroporated into 20 µl of DH10B Electromax cells
(Gibco-BRL, Gaithersburg, MD) to produce a library of scFv
constructs. The Gibco-BRL electroporator and voltage booster
was used as recommended by the manufacturer. Cells were
plated at 3,000 cfu/plate on plates containing 0.05 mM IPTG,
to induce Fv expression.

For screening, the labelled chelate was prepared by
incubating 10 µl of 0.075 mM Eotube chelate with 50 µCi of
buffered 111-Indium Chloride in a metal free tube. Colony
lifts of the petri plates containing the protein library were
prepared using BA83 nitrocellulose filters (Schleicher and
Schuell, Keene, NH). The filters were blocked by incubation
in Blotto (7% non-fat milk in PBS) for 10 minutes, washed with
PBS, followed by incubation in Blotto containing 10 µCi of
111-Indium Chloride per filter for 1 hour at room temperature.
The filters were then washed repeatedly with PBS for a total
of 15 minutes, dried and exposed to Kodak X-omat AR
autoradiography film for several hours.

The quality of the protein library was determined by DNA
sequencing of the linker of several unscreened clones.
Sequencing was performed as described in Example I. The
composition of the mixed site residues was 19% G, 31% A, 25%
T, 25% C (n=119).
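
As a rough check on how far this composition departs from the
intended equal proportion of the four nucleotides, a
chi-square goodness-of-fit statistic can be computed. The
sketch below is an illustration only and is not part of the
described procedure: it works from the rounded percentages
reported above rather than raw base counts, and it uses the
standard tabulated critical value for three degrees of freedom
at the 0.05 level.

```python
observed_fraction = {"G": 0.19, "A": 0.31, "T": 0.25, "C": 0.25}
n = 119            # number of mixed-site residues sequenced
expected = 0.25    # intended proportion of each nucleotide

# Chi-square statistic: sum over bases of (observed - expected)^2 / expected,
# expressed in counts (fraction * n).
chi_square = sum(
    n * (fraction - expected) ** 2 / expected
    for fraction in observed_fraction.values()
)

CRITICAL_0_05_DF3 = 7.815  # tabulated critical value, 3 d.f., p = 0.05
print(round(chi_square, 2))
print(chi_square < CRITICAL_0_05_DF3)
```

Under these assumptions the statistic falls below the critical
value, so the observed composition would not be distinguished
from an equal mixture at that level.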

The size of the library was determined by plating. In a
typical electroporation, 30,000 cfu's were obtained from
electroporation of 1 µl of ligation mixture into 20 µl of
cells. The ligation contained 0.1 µg of DNA in 20 µl. The
library size was about 3x10^4 recombinants and the
electroporation efficiency was 6x10^6 cfu/µg. Approximately
30,000 clones were screened, and about 60 colonies gave a
range of signals on the primary screen (0.2%). Those with the
strongest signal were colony purified and the DNA sequence of
the linker was determined. The sequence of one linker from
an identified scFv clone is shown in Figure 4C.
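
The electroporation efficiency and the fraction of positive
colonies follow from the quantities given above. The sketch
below simply reproduces that arithmetic; the variable names
are illustrative.

```python
cfu_per_electroporation = 30_000   # colonies from 1 ul of ligation mixture
ligation_dna_ug = 0.1              # DNA in the whole ligation
ligation_volume_ul = 20
volume_used_ul = 1

# DNA actually electroporated, in micrograms
dna_used_ug = ligation_dna_ug * volume_used_ul / ligation_volume_ul
efficiency = cfu_per_electroporation / dna_used_ug

clones_screened = 30_000
positive_colonies = 60
hit_rate = positive_colonies / clones_screened

print(f"{efficiency:.1e} cfu/ug")
print(f"{hit_rate:.1%} positive on the primary screen")
```
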
LIBRARY MUTAGENESIS 

Library mutagenesis using a heterogeneous primer
population permits incorporation of a large number of
mutations into a population of host cells to generate a
recombinant library. The resulting mutations are typically
introduced into a polynucleotide suitable for cell delivery.
The polynucleotide can additionally be adapted for expression.
These polynucleotides may contain changes in either the
regulatory region of the polynucleotide or in a translatable
region. The directed mutations in the polynucleotide sequence
may alter levels of protein expression, alter a functional
characteristic of a protein, or confer a particular cell
phenotype. The incorporation of a large number of mutations
into a host population is termed library mutagenesis. In
general, libraries can be prepared and screened for changes in
any measurable cell property. Similarly, the transformed or
transfected cells containing the altered nucleic acid
sequences can be screened or selected for a desired
polynucleotide sequence independent of polypeptide expression.

There are several different methods for performing
library mutagenesis that are available to those of skill in
the art. A number of these methods use PCR to produce a
library of mutant constructs. However, none of the existing
methods for making mutant libraries are based on inverse PCR.

Enzymatic Inverse PCR (EIPCR) amplifies the entire
plasmid, a portion of the plasmid or linear sequence of a
polynucleotide. These methods differ from other mutagenesis
methods in the use of class IIS restriction sequences in the
5' end of both primers. Digestion with class IIS restriction
enzymes, such as BsaI (GGTCTCN'NNNN), which have their
recognition sequence 5' to, and separated from, their cleavage
site allows the removal of the entire recognition sequence
prior to ligation. This preferably leaves the linear PCR
product with compatible overhangs at each end.
Intra-molecular ligation of the PCR product yields a
full-length circular plasmid.
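
The removal of the recognition sequence by digestion can be
illustrated with a small Python sketch. It models only the
top strand at one end of a linear product and assumes the
Bsa I geometry given above (cuts one and five nucleotides 3'
to GGTCTC, leaving a four nucleotide overhang). The function
name and the example sequence are hypothetical.

```python
BSA_I_SITE = "GGTCTC"
TOP_CUT_OFFSET = 1      # top strand is cut 1 nt 3' of the site
BOTTOM_CUT_OFFSET = 5   # bottom strand cut, in top-strand coordinates


def digest_5p_end(top_strand):
    """Cut a top strand at its first Bsa I site.

    Returns (removed_fragment, overhang, retained_sequence), each
    written 5' to 3' on the top strand. The retained sequence begins
    with the single-stranded overhang.
    """
    site_start = top_strand.find(BSA_I_SITE)
    if site_start < 0:
        raise ValueError("no Bsa I site found")
    site_end = site_start + len(BSA_I_SITE)
    top_cut = site_end + TOP_CUT_OFFSET
    bottom_cut = site_end + BOTTOM_CUT_OFFSET
    if bottom_cut > len(top_strand):
        raise ValueError("sequence too short for the cleavage site")
    removed = top_strand[:top_cut]
    overhang = top_strand[top_cut:bottom_cut]
    retained = top_strand[top_cut:]
    return removed, overhang, retained


# Hypothetical product end: 4-nt filler, site, 1-nt spacer, then insert.
product_end = "ATATGGTCTCAGTTCCACGTTGCAAGC"
removed, overhang, retained = digest_5p_end(product_end)
print(removed)
print(overhang)
print(retained)
```

The retained sequence no longer contains the recognition
sequence, which is why the site does not appear in the final
circularized construct.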

An important advantage of EIPCR library mutagenesis is
that any plasmid or DNA fragment can be used to create a
library of mutations. The only limitation is the efficiency
of the PCR process. The generation of a complementary strand
is limited by the length of the template and by the elongation
rate of the polymerase. It is likely that advances in the PCR
technology, in particular, enzyme efficiency, will permit long
DNA fragments to be used in this invention. The library
mutagenesis methods disclosed herein are rapid and efficient
and permit one of skill in the art to generate several
libraries in a day. For example, once primers are prepared,
libraries such as those prepared in Example III can be
generated in 6 to 10 hours.

In EIPCR library mutagenesis, the entire plasmid is
amplified using mutagenic primers. The simple design of EIPCR
results in a high efficiency of ligation of mutant plasmids,
thus generating a high level of diversity in the library. The
higher the level of genetic diversity in a recombinant
library, the more likely the library will contain a mutant of
interest readily identifiable by methods known to one of skill
in the art. Another important benefit of EIPCR over other
methods for library mutagenesis is that, as in EIPCR
site-directed mutagenesis, mutations can be made in any area
of the sequence independent of available restriction
sequences. Restriction endonuclease recognition sites are not
incorporated into the final construct. The usefulness of EIPCR
for library mutagenesis is described in Example III and
illustrated in Figure 5.

A method for performing library mutagenesis to generate
a recombinant library by introducing changes within a
predetermined region of linear or, preferably, circular
double-stranded DNA is contemplated herein. The method
comprises (a) providing a first primer population and a second
primer population, each having at least one variable base at
known complementary positions along the primers capable of
directing a change in the nucleic acid sequence, the first and
second primer populations being substantially complementary to
the double-stranded nucleic acid to allow hybridization
thereto and having a class IIS restriction enzyme recognition
sequence and cleavage sites, (b) hybridizing the first and
second primer populations to opposite strands of the
double-stranded nucleic acid to form a first pair of
primer-templates oriented in opposite directions, (c)
performing enzymatic inverse PCR as hereinbefore described,
(d) cutting the double-stranded linear molecules with a class
IIS restriction enzyme to form restricted linear
polynucleotide sequences containing the change in said nucleic
acid sequence, thereby removing restriction endonuclease
recognition sites, (e) optionally joining termini of the
restricted linear molecules of step (d) to produce a
double-stranded circular polynucleotide sequence, and (f)
introducing the polynucleotide sequence obtained from step (d)
or (e) into compatible host cells.

The term "primer population" is used to describe the pool
of primers that have identical base compositions except at
certain predetermined locations along the sequence that
contain a variable composition. The primers for EIPCR library
mutagenesis are otherwise designed similarly to those primers
used for EIPCR site-directed mutagenesis. Primer pairs for
EIPCR mutagenesis are designed to hybridize to the top and
bottom strands of a double-stranded template and to extend in
opposite directions. The primers are chosen to be
substantially complementary to that region of the nucleic acid
template to be mutagenized. These primers may be overlapping
on the template, contiguous, or non-overlapping. The
primer pairs are substantially complementary to the template
to facilitate hybridization during the PCR process.
Preferably, the primer contains at least a 15 base region at
the 3' end of the primer that is complementary to the
template. Other regions of complementarity may be
interspersed throughout the length of the primer. The primer
additionally contains a class IIS restriction endonuclease
recognition sequence and a region containing noncomplementary
bases that confers the desired variable mutation. The
variable region can be of any length, the only restriction on
length being the ability of the primers to hybridize to the
template and direct synthesis of a substantially complementary
strand of DNA. Further, the variable region or regions may be
interspersed between complementary regions along the primer
strand. Filler base regions can additionally be added to the
primer at the 5' end of the primer, before the class IIS
recognition sequence, and between the class IIS recognition
sequence and the class IIS cleavage site. Any final primer
length is contemplated within the scope of the invention.
Primer length is limited only by the efficiency of the
oligonucleotide synthesizer. Primers may be prepared by
methods known to those of skill in the art. Those with skill
in the art will be readily able to determine if a given primer
adequately hybridizes to a given template and is thus suitable
for amplification using EIPCR.

The extent of primer variability desirable for library
mutagenesis is determined during primer synthesis. A mixture
of nucleotides, or polynucleotides such as amino acid encoding
trimers, is introduced at one or more positions along the
primer oligonucleotide. The addition of trinucleotide
fragments during synthesis provides direct control over amino
acid mixtures. The nucleotide mixture is formulated to
contain a predetermined percentage of each of the four bases.
These percentages may vary from 0% to less than 100% for any
one base and from 0 to 100% for each of the 64 amino acid
encoding trimers. The frequency of a given sequence is
determined by the desired probability that a particular base
or trimer will be present at a particular position along the
primer. Thus, for example, if the library is to contain
variable mutations at position 6 of the primer oligonucleotide
corresponding to a 75% average likelihood that position 6 is
guanosine and a 25% average likelihood that position 6 will be
adenosine, then the elongating primer will be exposed to a
mixture of 3/4 guanosine and 1/4 adenosine at position 6.
These mixtures can also be prepared in proportions such that
for a region of 10 bases it is likely that on average only one
of the 10 bases in any primer is different from the template
sequence. This provides a primer pool that theoretically
represents every possible permutation in each nucleotide
position over a 10 base pair sequence. A review of primer
preparation and design in random mutagenesis can be found in
Oligonucleotides and Analogues: A Practical Approach (F.
Eckstein Ed., Oxford University Press, 1991) and Hermes et
al., Gene 84:143-151, 1989, which are hereby incorporated by
reference.
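By way of illustration only, the mixing proportions described above can be sketched in Python. The template region, the function names and the choice of a 90% template fraction (which gives, on average, one change per 10 doped bases) are assumptions introduced for this sketch and are not taken from the specification.

```python
import random
from math import comb

def doped_mixture(template_base, template_fraction):
    """Base mixture that keeps the template base at template_fraction and
    splits the remainder equally among the other three bases."""
    other = (1.0 - template_fraction) / 3.0
    return {b: (template_fraction if b == template_base else other) for b in "ACGT"}

def synthesize_primer(mixtures, rng):
    """Draw one primer, choosing each position from its own base mixture."""
    return "".join(
        rng.choices(list(m), weights=list(m.values()), k=1)[0] for m in mixtures
    )

def prob_k_changes(n_positions, template_fraction, k):
    """Binomial probability that exactly k of n doped positions differ from template."""
    p = 1.0 - template_fraction
    return comb(n_positions, k) * p**k * (1.0 - p) ** (n_positions - k)

# Hypothetical 10-base region, doped so that each position keeps the
# template base 90% of the time (expected changes per primer = 10 * 0.1 = 1).
TEMPLATE = "ATGAAAAAGA"
MIXTURES = [doped_mixture(base, 0.9) for base in TEMPLATE]

# The single-position example from the text: 3/4 guanosine, 1/4 adenosine.
POSITION_6_MIX = {"G": 0.75, "A": 0.25}

if __name__ == "__main__":
    rng = random.Random(0)
    pool = [synthesize_primer(MIXTURES, rng) for _ in range(1000)]
    changes = [sum(a != b for a, b in zip(p, TEMPLATE)) for p in pool]
    print("mean changes per primer:", sum(changes) / len(changes))
    print("P(exactly one change):", round(prob_k_changes(10, 0.9, 1), 4))
```

Under these assumptions only a minority of primers carry exactly one change; the remainder carry none or several, which is why the average, rather than every primer, is one change per 10 bases.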

As illustrated in Figure 6, the primer pairs contain a
complementary region at the class IIS restriction endonuclease
cleavage site. In EIPCR library mutagenesis, this overlapping
region preferably does not contain a mutation. This ensures
that recircularization of the template can occur following PCR
amplification. In the examples that follow, the class IIS
restriction endonuclease BsaI is used to generate a four base
overhang at each end of the nucleotide sequence. Figure 3
provides an exemplary list of other class IIS restriction
endonucleases contemplated within the scope of this
invention.
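The requirement that the two primers share an unmutated complementary region at the cleavage site can be checked with a short sketch. It relies on the known specificity of BsaI (recognition sequence GGTCTC, cutting 1 nucleotide downstream on the recognition strand and 5 nucleotides downstream on the opposite strand, leaving a four base 5' overhang). The two primer sequences and all function names are hypothetical and are not the primers of Figure 6.

```python
BSAI_SITE = "GGTCTC"  # BsaI recognition sequence, read 5'->3' on the primer strand

_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]

def bsai_digest_end(primer):
    """Model BsaI cleavage of the product end derived from one primer.

    Returns (removed, overhang, retained):
      removed  - filler, recognition site and one spacer base that are cut away
      overhang - the four base 5' overhang left on the retained molecule
      retained - primer-derived sequence kept in the final construct
    """
    i = primer.find(BSAI_SITE)
    if i < 0:
        raise ValueError("primer has no BsaI recognition site")
    top_cut = i + len(BSAI_SITE) + 1   # 1 nt past the site on this strand
    bottom_cut = top_cut + 4           # 5 nt past the site on the other strand
    if bottom_cut > len(primer):
        raise ValueError("primer too short to contain the cleavage site")
    return primer[:top_cut], primer[top_cut:bottom_cut], primer[top_cut:]

def ends_can_recircularize(primer_a, primer_b):
    """True if the overhangs from the two product ends can anneal to each other."""
    _, overhang_a, _ = bsai_digest_end(primer_a)
    _, overhang_b, _ = bsai_digest_end(primer_b)
    return overhang_a == reverse_complement(overhang_b)

# Hypothetical primers: filler + GGTCTC + 1 spacer base + 4 base overlap + anchor
PRIMER_A = "TTGGTCTCAGCTACCGTAGCAGG"
PRIMER_B = "AAGGTCTCTTAGCTTTTTCATGG"

if __name__ == "__main__":
    print(bsai_digest_end(PRIMER_A))
    print(ends_can_recircularize(PRIMER_A, PRIMER_B))
```

Because the recognition site lies on the removed side of the cut, the retained sequence in this model carries no BsaI site, consistent with the statement that recognition sites are not incorporated into the final construct.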

Library mutagenesis can be used to alter any region
within a nucleic acid sequence. These mutagenesis procedures
are particularly useful for generating a library of mutations
within the mature region of a protein sequence, within a
leader sequence, or within sequences that do not encode
protein. Sequences that do not encode protein may influence
or regulate protein expression. These include, but are not
limited to, non-coding regions on the DNA, for example,
enhancer sequences, promoter regions, sites for DNA binding
proteins such as repressors, Z-DNA formation, matrix
associated regions, telomeres, origins of replication and
recombination signals. In addition to those non-coding
regions on the DNA that are transcribed, non-coding regions on
mRNA additionally contemplated include, but are not limited to,
snRNPs, spliceosomes, ribosome binding sites, regions of
secondary structure, terminators, stability sites and cap
sites. It is additionally contemplated within the scope of
this invention that EIPCR library mutagenesis can be used to
generate recombinant libraries containing altered sequences
corresponding to tRNA or rRNA. Mutations in regulatory regions
of a nucleic acid sequence can affect the level of protein
expression, while in-frame substitution mutations within the
nucleic acid sequence encoding protein can affect protein
function. It is therefore contemplated that the procedures
described herein will be useful for generating recombinant
libraries having mutations in any of these aforementioned
regions of the nucleic acid.

EIPCR library mutagenesis can be used to alter the 
functional characteristics of a particular protein. A protein 
sequence engineered into an expression construct can be used 
as a nucleic acid template for EIPCR library mutagenesis. 
Like other forms of library mutagenesis, this procedure can be 
used, for example, to mutagenize a binding region on a 
polypeptide, thereby generating an expression library that can 
be screened or selected for altered binding characteristics. 
EIPCR mutagenesis can also be employed to mutate a region of
a polypeptide sequence that influences intra-molecular
binding. For example, a polypeptide region that links two
protein domains involved in ligand binding can be mutated,
using the methods disclosed herein, to optimize the
interactions between the protein domains.

One type of mutagenesis contemplated within the scope of
this invention is wobble base library mutagenesis using EIPCR.
Wobble base mutagenesis incorporates mutations within the
primer population in positions that correspond to the third
position of a nucleotide codon. Most mutations in the third
position of a codon do not alter the amino acid sequence of
the resulting polypeptide. Accurate tRNA-mRNA pairing is
required at the first two positions within the codon during
translation. The third position can tolerate pairing with
more than one tRNA and this degeneracy is termed a "wobble".
Thus the same amino acid sequence can be derived from several
different nucleotide sequences.
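The degeneracy described above can be made concrete with a short Python sketch that, for each codon of a coding sequence, finds the third-position bases that leave the encoded amino acid unchanged and writes them as a single degenerate symbol. The standard genetic code is used; the 12-base input sequence and the function names are hypothetical and serve only as an illustration.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Degenerate symbols for the third-position sets that occur in the standard code
DEGENERATE = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R",    # purine
    frozenset("CT"): "Y",    # pyrimidine
    frozenset("ACT"): "H",   # isoleucine codons only
    frozenset("ACGT"): "N",  # any base
}

def silent_third_bases(codon):
    """Bases allowed at the third position without changing the amino acid."""
    aa = CODON_TABLE[codon]
    return {b for b in BASES if CODON_TABLE[codon[:2] + b] == aa}

def wobble_degenerate(coding_seq):
    """Rewrite a coding sequence with a degenerate symbol at every wobble position."""
    if len(coding_seq) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    out = []
    for i in range(0, len(coding_seq), 3):
        codon = coding_seq[i:i + 3]
        out.append(codon[:2] + DEGENERATE[frozenset(silent_third_bases(codon))])
    return "".join(out)

if __name__ == "__main__":
    # Hypothetical coding sequence (Met-Lys-Lys-Thr)
    print(wobble_degenerate("ATGAAAAAGACA"))
```

Methionine and tryptophan codons have no silent third-position alternative, so a primer designed this way stays fixed at those codons.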

Alterations in the nucleotide sequence that do not affect 
the protein sequence may alter the level of protein synthesis 
or expression within a given host. In particular, alterations 
in the nucleic acid sequence of the leader portion of a 
polypeptide can influence levels of protein synthesis from one 
protein to another or from one host to another. An example of 
two primers designed to confer alterations in the OmpA leader 
sequence that result in increased levels of antibody Fv 
fragment expression from E. coli is found in Figure 6. Once 
a leader sequence is optimized for the expression of one 
particular polypeptide, using EIPCR library mutagenesis, 
within a given host, it is further contemplated that this 
leader sequence can then be linked to other gene sequences 
encoding polypeptides to optimize expression of those
polypeptides. Similarly, it is also contemplated within the
scope of this invention that other regulatory regions can be 
optimized using EIPCR library mutagenesis and that these 
optimized regions can be engineered into other expression 
constructs for maximal expression of other polypeptides in 
vitro or in vivo. 

The invention is preferably designed to incorporate one 
or more random changes within predetermined regions of a 
circular template, such as a vector. Vector choice is 
determined first by the choice of host cell used to create the 
desired library. It is well known to those of skill in the 
art that vectors are commercially available for protein 
expression in prokaryotic and eukaryotic systems. Expression 
vectors are available for bacteria, yeast and mammalian 
systems. In addition, viral vectors for both eukaryotic and 
prokaryotic cells are also contemplated within the scope of 
this invention. Expression vectors are required when the 
translation products from the mutated nucleic acid sequences 
are to be assayed. An analysis of random mutations in nucleic
acid may not require the use of an expression vector where
mutations can be screened using polynucleotide probes or the
like. Those with skill in the art will be able to choose an
appropriate commercially available vector, create their own
vector, or recreate the exemplary vector described in Example
V below.

It is additionally contemplated within the scope of this 
invention that EIPCR library mutagenesis could be performed on 
one region of nucleic acid within a construct, and a second 
(and/or subsequent) mutagenesis procedure be performed on
another region of a construct or on a separate nucleic acid
construct. Following amplification, these sequences can then 
be combined to produce a construct with two or more regions of 
random mutagenesis. 

A general description of the hybridization of aliquots of 
the first and second primer pools to the nucleic acid template 
as well as a general description of EIPCR are disclosed in the 
detailed description of site-directed mutagenesis beginning on 
page 16. The term "inverse" in enzymatic inverse polymerase
chain reaction is used to describe the primer pair orientation
during the PCR process such that at the initiation of
elongation the 3' ends of the primers are directed away from
one another. The mechanics of hybridization and nucleic acid
sequence amplification in library mutagenesis are similar to,
if not identical to, those employed in EIPCR site-directed
mutagenesis and will not be repeated here. Thus, the term
"performing EIPCR" as a step in the production of a library of
mutations following the hybridizing step of the primers to the
template, comprises 1) extending the first pair of
primer-templates to create double stranded molecules; 2) denaturing
the primer templates; 3) hybridizing the first and second
primers at least once to the double stranded molecules to form
a second pair of primer-templates; 4) extending the second
pair of primer-templates following hybridization to produce
double-stranded linear molecules terminating with class IIS
restriction enzyme recognition sequences; and 5) repeating
steps 1-3 as needed.

Once mutated linear template has been generated in
sufficient quantity, the appropriate class IIS restriction
enzyme is used to cleave the nucleic acid to create termini
compatible for ligation. Ligation of the linear molecules is
performed under conditions that favor recircularization of the
plasmid. These conditions are well known to those with skill
in the art and exemplary conditions are described in Example 
III. 

The nucleic acid is next introduced into the desired host
cells. The nucleic acid can be introduced into the host cells
by any means known to those of skill in the art. These
methods include, but are not limited to, methods to prepare
competent bacterial cells including CaCl2 treatment, and
methods to transfect eukaryotic cells including CaPO4
precipitation, liposome mediated transfection, viral
infection, or electroporation. The method for introducing
nucleic acid into the host cell will, in part, be determined
by the host cell type. Descriptions of each of the
transformation and transfection procedures are found in
recombinant methodology handbooks including those of Sambrook
et al. or Ausubel et al. (supra). Following transfection,
transformation or infection, the cells are expanded and
screened for the desired cell function. There are a variety
of screening assays that are available to the investigator.
Assay design should reflect the desired goal of mutagenesis.
For example, the assay disclosed in Example III below is
designed to detect increased levels of expression of a
particular antibody fragment in E. coli. Assays can also be
designed to detect increases in the binding constants (Ka) of
an antibody or receptor to its antigen or ligand. Other
assays can be designed to detect changes in the level of
protein expression or changes in the functional activity of a
protein. For example, in a eukaryotic system, the increased
ability of a protein to promote growth or stimulate a
particular cellular function can be measured by removing cell
supernatants from mutated cells or their progeny, adding this
supernatant to susceptible cells, and assaying for growth
promoting activity. Those with skill in the art will be able
to select an appropriate screening or selection assay for a
particular library to identify a particular clone of interest.

In a second example, EIPCR library mutagenesis can be
used to alter the expression of one polypeptide in relation to
a second polypeptide. Thus in Example III below, random
mutagenesis is used to increase the level of Fv heavy chain
expression, thereby equalizing levels of heavy and light chain
Fv fragment expression.

In general once a particular mutation is identified as 
conferring a desired property to a protein sequence, the cells 
are selected and expanded. The nucleic acid containing the 
desired mutation is isolated and sequenced. Identified
sequences from mutations in regulatory regions of a nucleic
acid sequence can then be genetically transposed to other
expression systems. Thus, a contemplated method within the
scope of this invention is one that identifies an optimized
nucleic acid sequence derived from EIPCR library mutagenesis
to promote an increase in the level of protein expression as
compared with wild type sequence.

The following examples of random EIPCR library 
mutagenesis are provided below. These examples are intended 
to illustrate but not limit the invention. 


EXAMPLE III 

This example illustrates a preferred embodiment of EIPCR
library mutagenesis, wobble base mutagenesis. In wobble base
mutagenesis, mutations are introduced into the nucleic acid
sequence without altering the amino acid sequence of the
target protein. In this example, the leader or signal
sequence of a protein is variably mutated in the third base
position of at least one codon to generate a recombinant
library that can be screened for colonies with increased
levels of eukaryotic protein expression as compared with
non-mutated controls. The expression level of foreign proteins in
E. coli is determined by a large number of factors, and
expression level optimization is normally a slow and tedious
process. For secreted proteins, like the exemplary antibody
Fv fragments used here, optimization of expression is
complicated by the difficulties associated with secreting a
eukaryotic protein in a prokaryotic system. Without the
optimized modifications generated by EIPCR library
mutagenesis, described below, secretion and expression of
eukaryotic proteins in prokaryotic systems is very low.

In this particular example, expression of the Fv fragment
of an anti-metal-chelate antibody (CHA255) was
optimized in E. coli. The Fv fragment was expressed in active
form in the periplasm of E. coli. Both the heavy and light
chains of the Fv fragment, each with its own leader peptide,
were placed under the control of a Lac promoter on a 1.8 kb
plasmid. The CHA255 antibody binds a chelated radioactive
metal (111-Indium or 90-Y chelate complex) to provide a simple
screening assay to permit detection of functional antibody
fragments. For optimization of expression or mutagenesis of
other proteins and antibodies, other screening systems may be
useful.

Expression Vectors 

Any expression vector that can be amplified together with
its insert is contemplated within the scope of this invention.
However, we have chosen to exemplify a relatively small
plasmid (< 7 kb) that is readily amplified by PCR. pMCHAFvl,
the 1.8 kb expression vector used for EIPCR mutagenesis and Fv
expression, is shown in Figure 5. The nucleic acid sequence
encoding the light chain of the Fv fragment is 5' to the nucleic
acid sequence encoding the heavy chain of the Fv fragment.
Each chain has its own OmpA signal peptide, and both chains
are driven by a single Lac promoter. The OmpA signal sequence
and Lac promoter sequence are provided in references from
Movva et al. and Reznikoff et al., respectively, which are
hereby incorporated by reference (Movva et al., J. Biol. Chem.
255:27-29, 1980, J. Mol. Biol. 143:317-328 (1980) and
Reznikoff et al. (1980) "The Lac Promoter", The Operon, Miller
et al. Eds. Cold Spring Harbor Press, NY). The antibody genes
for CHA255 are the same as those used in Example I above. The
codons of the light and heavy chain are those obtained from
the original mouse antibody sequence. Similarly, the OmpA
leader sequence is the native sequence obtained from the OmpA
protein nucleic acid sequence as described in Example I.
pMCHAFvl was constructed from pMINT3 (Figure 5). pMINT3 is a
1.0 kb expression vector which contains a synthetic Lac
promoter, supF (derived from tRNA-tyr, Huang et al., supra)
as the selectable marker, and a rop- ColE1 origin obtained
from pUC (Pharmacia, Piscataway, N.J.). The supF vectors are
designed to be used with commercially available chemically or
electro-competent E. coli MC1061/P3 cells (Invitrogen Inc., San
Diego, CA). These cells contain amber mutations in both the
ampicillin and tetracycline drug resistance genes, located on
a P3 incompatibility group plasmid. Thus the P3 plasmid can
co-exist with ColE1 incompatibility group plasmids such as
pUC. The P3 plasmid is too large to interfere with pUC
plasmid purification. Transformants are selected on plates
with 25 µg/ml ampicillin and 7.5 µg/ml tetracycline.

Oligonucleotide Synthesis for Wobble Mutagenesis 

The two oligonucleotides used to construct the library
are shown schematically in Figure 6B. The oligonucleotides
are designed to hybridize to opposite DNA strands of the
pMCHAFvl template adjacent to the OmpA leader sequence. The
resulting DNA and mRNA derived from this pool of mutated
oligonucleotides is a library of sequences, all encoding the
same OmpA protein sequence. The X in Figure 6B corresponds to
the variable positions within the primer population. The
sequences are provided as SEQ ID NO: 26 and SEQ ID NO: 27.
Here the N corresponds to the X in Figure 6B. Primer
oligonucleotides also contain R and Y base designations. The
R indicates the incorporation of a purine and the Y indicates
the incorporation of a pyrimidine. The limitation to purines
or pyrimidines in the third position of the codon ensures that
the amino acid sequence is not modified by the incorporation
of random nucleotides. Constant regions within the primer are
coded by the appropriate base designation. The primers
(moving 5' to 3') contain, as indicated, filler sequence, a
BsaI class IIS restriction endonuclease recognition site,
filler sequence, a BsaI cleavage site that forms the cohesive
termini for circularization, a region comprising random base
positions in the third position of the nucleotide codon, and
a complementary region to anchor the primer to the template
during hybridization. Oligonucleotide synthesis was performed
on a Milligen/Biosearch 8700 DNA synthesizer (Milligen,
Burlington, MA). The mixed base positions were synthesized
using a fresh 1:1:1:1 molar mixture of each of the four bases
in the U reservoir. The oligonucleotides were made trityl-on
and were purified with Nensorb Prep nucleic acid purification
columns (NEN-Dupont, Boston, MA) as described by the
manufacturer.
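As a rough illustration of how the N, R and Y designations translate into a pool of distinct primers, the following Python sketch counts and enumerates the sequences represented by a degenerate primer. The short example sequence and the function names are hypothetical; the actual primers are those of SEQ ID NO: 26 and SEQ ID NO: 27, which are not reproduced here.

```python
from itertools import product
from math import prod

# Bases represented by each designation used in the primers
DESIGNATIONS = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG",    # purine
    "Y": "CT",    # pyrimidine
    "N": "ACGT",  # any base
}

def complexity(primer):
    """Number of distinct sequences encoded by a degenerate primer."""
    return prod(len(DESIGNATIONS[symbol]) for symbol in primer)

def expand(primer):
    """List every concrete sequence encoded by a (short) degenerate primer."""
    return ["".join(bases) for bases in product(*(DESIGNATIONS[s] for s in primer))]

if __name__ == "__main__":
    example = "AARACN"  # hypothetical variable region
    print(complexity(example))
    print(expand(example))
```

When both primers of a pair carry variable positions and combine independently, the theoretical complexity of the library is the product of the two primer complexities.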

Amplification conditions and generation of modified template. 

PCR was performed in a 100 µl volume. Each reaction
contained 0.5 of each purified primer, 0.5 ng pMCHAFvl
template plasmid DNA, 1x Taq buffer, 200 µM of each dNTP and
1 µl Taq polymerase (Perkin-Elmer-Cetus). The thermo-cycling
parameters were: 94°C/3 min for 1 cycle; 94°C/1 min, 50°C/1
min, 72°C/2 min for 3 cycles; 94°C/1 min, 55°C/1 min, 72°C/2
min, with autoextension at 5 sec/cycle for 10 cycles; 94°C/1
min, 55°C/1 min, 72°C/3 min, with autoextension at 5
sec/cycle for 12 cycles; followed by one 10 min cycle at 72°C.
In a PCR reaction, the primers direct the amplification of a
linear DNA sequence of equal length to the template plasmid
with additional 11-14 bp extensions at each end of the DNA
that include the class IIS restriction sequence.
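For bookkeeping purposes, the cycling program above can be written as a small data structure, as in the Python sketch below. Several points are assumptions made for the sketch: the autoextension increment is read as 5 seconds in both blocks, the increment is taken to lengthen the final (72°C) step cumulatively starting with the second cycle of a block, and ramp times between temperatures are ignored. The class and function names are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Block:
    steps: tuple           # ((temperature in deg C, hold time in seconds), ...)
    cycles: int = 1
    autoextend_s: int = 0  # seconds added to the last step per successive cycle

def block_seconds(block):
    """Programmed hold time of one block, excluding temperature ramps."""
    base = sum(seconds for _, seconds in block.steps) * block.cycles
    # Assumed behavior: cycle n of a block holds the last step for
    # (n - 1) * autoextend_s extra seconds.
    extra = block.autoextend_s * sum(range(block.cycles))
    return base + extra

def total_seconds(program):
    """Programmed hold time of a whole program."""
    return sum(block_seconds(b) for b in program)

PROGRAM = (
    Block(steps=((94, 180),)),
    Block(steps=((94, 60), (50, 60), (72, 120)), cycles=3),
    Block(steps=((94, 60), (55, 60), (72, 120)), cycles=10, autoextend_s=5),
    Block(steps=((94, 60), (55, 60), (72, 180)), cycles=12, autoextend_s=5),
    Block(steps=((72, 600),)),
)

if __name__ == "__main__":
    amplification_cycles = sum(b.cycles for b in PROGRAM if len(b.steps) == 3)
    print("amplification cycles:", amplification_cycles)
    print("programmed hold time (min):", total_seconds(PROGRAM) / 60)
```

Actual instrument run time would be longer than the figure computed here, since ramping between temperatures is not modeled.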

PCR Product Manipulations 

The DNA obtained from 2-4 100 µl PCR reactions was
flushed by addition of dNTPs to 200 µM, 50 units DNA
Polymerase Klenow fragment and 30 units T4 DNA Kinase and
incubated at 37°C for 30 minutes. After phenol/chloroform
extraction and precipitation, the DNA was digested with BsaI
(New England Biolabs, Beverly, MA). The digested DNA was gel
purified, ethanol precipitated, and ligated at low
concentration and without polyethylene glycol to favor
intramolecular interactions, thus favoring circularization of
the nucleic acid as opposed to concatamer formation. The
ligation was ethanol precipitated using ammonium acetate,
washed twice with 80% ethanol, vacuum dried and resuspended in
20 µl 0.1 x TE (Sambrook et al., supra) for electroporation.
After digesting the 12-14 bp overhang with BsaI, the resulting
cohesive termini were ligated intramolecularly, and the
ligation was electroporated into E. coli for expression
analysis.

Electroporation

One microliter amounts of the ligation reaction were
electroporated into 20 µl of MC1061/P3 cells (Invitrogen, San
Diego, CA) using the Invitrogen electroporator. Cells were
plated on 23 x 23 cm plates as described above.

Cell Growth Conditions 

For routine cell growth that does not require foreign
protein expression, the cells were grown in M9CA media (Merril
et al., Proc. Natl. Acad. Sci. (USA) 74:4335-4339, 1979) which
is hereby incorporated by reference.

For colony lift screening assays, the cells were plated
on 23 x 23 cm plates with CS agar (43 g/l yeast extract, 24 g/l
tryptone, 3 g/l NaH2PO4, 3 g/l Na2HPO4, 15 g/l agar) with 0.5
µg/ml isopropylthiogalactoside (IPTG) (Boehringer Mannheim,
Indianapolis, IN) for induction of protein expression.

For expression level determination, clones were grown in
CS broth with 0.2 mM IPTG in baffled shaker flasks at 250 rpm
for 30 hours at 30°C, with a boost of 0.2 volumes of 240 g/l
yeast extract and 120 g/l tryptone after 13 hours. The Fv
expressing constructs were grown at 30°C. CS broth permits
the use of higher levels of IPTG before over-expression of the
foreign protein causes bacterial death. Thus, with CS broth
most of the Fv protein can be found in the media rather than
in the bacterial periplasm.

Size Determination of the Random Library 

The molar ratios of fresh bases were reflected accurately
in the oligonucleotide pool as determined by the methods of
Hermes et al. (Proc. Natl. Acad. Sci. (USA) 87:696-700, 1990)
which is hereby incorporated by reference. The ratio of bases
in the mixed sites within the PCR product was verified by DNA
sequencing a representative sampling of individual clones.
The composition of the mixed site residues in the PCR product
was 19% G, 31% A, 25% T, 25% C (n=119).

The theoretical maximum complexity of the library is
3 x 10^9 different sequences. The actual size of the library was
determined by plating. In a typical electroporation, 5 x 10^5
colony forming units (cfu) were obtained from electroporation
of 1 µl of ligation mixture into 20 µl of cells. The ligation
contained 0.5 µg of DNA in 20 µl. The library size is thus
about 1 x 10^7 and the efficiency was 2 x 10^7 cfu/µg. For this
particular example, the screening assay was found to be more
limiting than library size.
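The arithmetic behind the library size and efficiency figures can be reproduced with the following Python sketch, which uses only the quantities stated above. The function names are introduced for illustration.

```python
def library_size(cfu_per_electroporation, volume_used_ul, ligation_volume_ul):
    """Colonies expected if the whole ligation were electroporated."""
    return cfu_per_electroporation * ligation_volume_ul / volume_used_ul

def transformation_efficiency(cfu_per_electroporation, volume_used_ul,
                              ligation_volume_ul, ligation_dna_ug):
    """Colony forming units obtained per microgram of ligated DNA."""
    dna_used_ug = ligation_dna_ug * volume_used_ul / ligation_volume_ul
    return cfu_per_electroporation / dna_used_ug

if __name__ == "__main__":
    # 5 x 10^5 cfu from 1 ul of a 20 ul ligation containing 0.5 ug of DNA
    print(f"library size: {library_size(5e5, 1, 20):.1e}")
    print(f"efficiency:   {transformation_efficiency(5e5, 1, 20, 0.5):.1e} cfu/ug")
```

One microliter of the ligation carries 0.025 µg of DNA, so 5 x 10^5 cfu corresponds to 2 x 10^7 cfu/µg, and the 20 µl ligation as a whole corresponds to about 1 x 10^7 clones.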

Colony Screening Assay

Colony lifts of 23 cm x 23 cm plates with 0.3-1 x 10^5
colonies were prepared using BA33 nitrocellulose filters
(Schleicher and Schuell, Keene, NH). The filters were blocked
by incubation in 3% non-fat milk in 25 mM Tris-HCl pH 7.5 for
10 minutes, washed with 25 mM Tris, followed by incubation in
25 mM Tris containing 50 µCi of chelated 111-Indium or 90-Yttrium
per filter for 1 hour at room temperature. The filters were
then washed with 25 mM Tris for a total of 15 minutes, dried
and exposed to Kodak X-Omat AR autoradiography film for
several hours.

Approximately 5 x 10^ clones were screened, and a wide 




range of signals was obtained on the primary screen.
Bacterial colonies that corresponded to strong filter signals
were purified by replating. These were again assayed for
activity. Two colonies with very strong signals were colony
purified and reassayed. The expression level of these two
clones was about ten times that of the wildtype. Assay design
for the expression of other antibody fragments in E. coli is
outlined by Skerra et al. (Anal. Biochem. 196:151-155, 1991)
which is hereby incorporated by reference.

Elimination of the effect of unintended mutations 

With any mutagenesis procedure there is a risk of
introducing mutations in areas other than the target. To
demonstrate that the observed increase in protein expression
was the result of the nucleic acid sequence identified from
the selected clone, a 130 bp fragment containing the mutated
area was cloned back into wildtype pMCHAFvl DNA. This
construct expressed more protein than the wildtype sequence,
proving that the 10-fold increase in the level of protein
expression as compared with wildtype controls is the result of
the mutated sequence.

DNA sequencing 

The sequence of the 130 bp fragment containing the
mutation that conferred increased protein expression was
determined by double stranded dideoxy sequencing on a Dupont
Genesis 2000 automated sequencer using the Dupont Genesis 2000
sequencing kit. The DNA sequence of the 130 bp fragment
differed from the wildtype sequence only at the targeted
wobble bases, confirming that the amino acid sequence was not
altered by the mutagenesis procedure. No mutations outside of
the targeted wobble bases were observed. The optimized
sequences obtained by this method are provided in Figure 6C
and listed as SEQ ID NO: 28 and SEQ ID NO: 29. These
sequences can then be further defined to more specifically
determine the expression promoting regions contained therein.
Therefore SEQ ID NO: 28 and SEQ ID NO: 29 or fragments thereof
can be used in subsequent expression systems to promote the
expression of the same or different protein.

Fv expression level quantitation 

The expression level of Fv fragments was determined by
assaying cell free supernatants. Wildtype and purified mutant
colonies were grown under expression conditions in CS broth as
described above. Dilutions of antibody containing samples
were incubated with radiolabeled metal-chelate. After
incubation for one hour, the free, unbound metal chelate was
separated from the antibody-bound metal chelate by
centrifugation through a Millipore Ultrafree filter (molecular
weight cut-off of 10,000 MW, Millipore, Bedford, MA). Samples
of the filtrate and the pre-filtration mixture were counted
for radioactivity, yielding a "fraction bound". A standard
curve of "fraction bound" versus known amounts of antibody was
constructed. The amount of Fv in an unknown sample was
determined from the standard curve. The results of the assay
indicated that the mutants reproducibly expressed 10 times
more active Fv fragment than the original construct.
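One way the "fraction bound" quantitation could be carried out numerically is sketched below in Python. The sketch assumes that equal volumes of filtrate and pre-filtration mixture are counted, so that the filtrate counts represent free chelate, and it reads unknowns off the standard curve by linear interpolation between adjacent standards. The standard curve values, the count values and the function names are hypothetical and are not data from this example.

```python
def fraction_bound(prefiltration_cpm, filtrate_cpm):
    """Fraction of chelate retained by the filter (antibody bound).

    Assumes equal volumes were counted, so filtrate counts are free chelate.
    """
    return 1.0 - filtrate_cpm / prefiltration_cpm

def amount_from_standard_curve(fraction, standards):
    """Interpolate an antibody amount from (amount, fraction_bound) standards.

    The standards are assumed to rise strictly with antibody amount.
    """
    points = sorted(standards, key=lambda point: point[1])
    for (a0, f0), (a1, f1) in zip(points, points[1:]):
        if f0 <= fraction <= f1:
            return a0 + (fraction - f0) * (a1 - a0) / (f1 - f0)
    raise ValueError("fraction bound lies outside the standard curve")

# Hypothetical standards: (amount of antibody, fraction bound)
STANDARDS = [(0.0, 0.0), (1.0, 0.2), (2.0, 0.4), (4.0, 0.7)]

if __name__ == "__main__":
    fb = fraction_bound(prefiltration_cpm=1000, filtrate_cpm=700)
    print("fraction bound:", round(fb, 3))
    print("antibody amount:", round(amount_from_standard_curve(fb, STANDARDS), 3))
```

Samples whose fraction bound falls outside the range of the standards raise an error in this sketch rather than being extrapolated, which mirrors the use of sample dilutions to keep readings on the curve.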

The protein sequence of the antibody fragments in this
example is not altered by wobble base mutagenesis. Therefore
any difference in signal strength in the screening assay is
due to differences in expression levels. However, the
expression level may be affected by the mutation in several
ways. The mRNA stability could be improved by the mutation.
Similarly, initiation and translation from the ribosome may be
improved. Further, protein expression is strongly influenced
by the sequence of the first few codons following the ATG
initiation codon (Bucheler et al., Gene 98:271-276, 1990).
Therefore, wobble base mutagenesis can potentially influence
polypeptide expression in a number of ways depending on where
the mutagenic primers bind to nucleic acid and which random
mutations are conferred upon the sequence.

EXAMPLE IV

In another preferred embodiment of this invention, EIPCR
is used to create a promoter library for gene expression in E.
coli.

In this particular example of the preparation of a
promoter library, Fv fragment expression of the anti-metal
chelate antibody (CHA255) is optimized using a population of
primers with variable sequences in the promoter region (Figure
7).

Expression Vectors 

In this example, the plasmid used is pCCHAVll, a 2.4 kb
plasmid containing the Lac promoter followed by an OmpA leader
sequence linked to the antibody light chain fragment sequence
and followed by an optimized OmpA sequence linked to the
antibody heavy chain fragment. Both antibody chain sequences
are driven by a single Lac promoter. This optimized OmpA
sequence (SEQ ID NO: 25) is derived from Example III. Plasmid
pCCHAVll is LacI negative, chloramphenicol resistance gene
positive with a Rop- ColE1 origin. In this example a
second copy of the Lac promoter region is placed in front of
the antibody heavy chain fragment sequence. The nucleic acid
sequence is provided in Figure 7A (SEQ ID NO: 30) and the
inserted promoter sequence is provided in Figure 7B and as SEQ
ID NO: 33. The inserted region includes the Lac promoter
library region followed by the wildtype Lac operator followed
by the ribosome binding site. The sequence including the
ribosome binding site is provided in SEQ ID NO: 34.

Oligonucleotide Synthesis 

The primers used to create the recombinant promoter
library are provided as ID SEQ NO: 31 and ID SEQ NO: 32. ID
SEQ NO: 31 directed mutations to the ribosome binding site
while ID SEQ NO: 32 directed changes to the Lac promoter
region. In Figure 7B the ribosome binding site and the -10 and
-35 regions of the Lac promoter are underlined, and the
sequences are provided as ID SEQ NO: 34 and ID SEQ NO: 33,
respectively. The bold underlining in Figure 7B corresponds
to the primer regions in ID SEQ NO: 31 and ID SEQ NO: 32 that
are underlined. The underlined portions are those positions
along the primer that contain variability. The expected
frequency of variability at each nucleotide position is
derived from a mixture of 75% template nucleotide and 8.3% of
each of the remaining three nucleotides. For example, in ID
SEQ NO: 31, the first underlined position is a cytosine. The
expected bias of the primer population at this position is:
75%:C, 8.3%:G, 8.3%:T, 8.3%:A. Libraries were created using
primer populations based on ID SEQ NO: 31 and ID SEQ NO: 32.
Other libraries were created using one biased primer
population while the other member of the primer pair contained
no variability. As an example, a recombinant library was
created using ID SEQ NO: 31 to prepare a variable first primer
pool, while the second primer corresponded exactly with ID SEQ
NO: 32 and therefore contained no variability. The library
generated from these primers contains mutated sequences at the
ribosome binding site and a constant Lac promoter sequence.
The oligonucleotides comprise a BsaI restriction endonuclease
recognition site, a region of variability reflected in the
underlined portion of ID SEQ NO: 31 and ID SEQ NO: 32, and a
region complementary to the template.
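
The biased ("doped") primer mixture described above can be modelled in a few lines. The Python sketch below is illustrative only: it assumes each variable position is drawn independently with 75% template base and the remaining 25% split evenly over the other three bases (8.3% each). The function names and the choice of variable positions in the usage note are hypothetical, since the underlined positions of the figures are not reproduced in the text.

```python
import random

BASES = "ACGT"


def doped_base_weights(template_base, template_fraction=0.75):
    """Probability of each base at one doped position."""
    other = (1.0 - template_fraction) / 3
    return {b: (template_fraction if b == template_base else other)
            for b in BASES}


def sample_doped_primer(template, variable_positions, rng,
                        template_fraction=0.75):
    """Draw one primer from the doped population.

    Positions outside variable_positions are copied from the template.
    """
    primer = list(template)
    for pos in variable_positions:
        weights = doped_base_weights(template[pos], template_fraction)
        primer[pos] = rng.choices(BASES,
                                  weights=[weights[b] for b in BASES])[0]
    return "".join(primer)


def expected_mutations(n_variable, template_fraction=0.75):
    """Mean number of non-template bases per primer."""
    return n_variable * (1.0 - template_fraction)
```

For instance, if all 12 positions of a 12-base stretch were doped, `expected_mutations(12)` gives an average of 3 changes per primer, and `sample_doped_primer("CAGGAAACAGCT", range(12), random.Random(0))` draws one member of such a population.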

PCR Amplification and Product Manipulation 

Sequences were amplified using conditions outlined in
Example III. Following amplification the nucleic acid was
cleaved with BsaI and ligated. Nucleic acid was
electroporated into E. coli.
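
Recircularization depends on the two ends of the cleaved product carrying complementary overhangs. As an illustration (a sketch, not the original procedure), the Python below assumes the commonly documented BsaI cut geometry, GGTCTC(1/5): the top strand is cut one nucleotide past the recognition site, leaving a four-base 5' overhang. The primer prefixes are the first 30 nucleotides of ID SEQ NO: 31 and ID SEQ NO: 32 as given in the sequence listing; the function names are hypothetical.

```python
BSAI_SITE = "GGTCTC"
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def bsai_overhang(primer):
    """Four-base 5' overhang left on the insert side after BsaI cleavage.

    Assumes the top strand is cut one base after GGTCTC and the bottom
    strand five bases after it.
    """
    site_start = primer.index(BSAI_SITE)
    cut = site_start + len(BSAI_SITE) + 1
    return primer[cut:cut + 4]


rbs_primer_5prime = "AACTATTGGTCTCAGTGGAATTGTGAGCGG"       # from ID SEQ NO: 31
promoter_primer_5prime = "ATCATTAGGTCTCACCACACAACATACGAG"  # from ID SEQ NO: 32

overhang_a = bsai_overhang(rbs_primer_5prime)
overhang_b = bsai_overhang(promoter_primer_5prime)
ends_are_compatible = overhang_a == revcomp(overhang_b)
```

Because the recognition site lies 5' of the cut, it is removed from the product on cleavage, which is why only the designed sequence remains at the junction after ligation.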

Colony Screening Assay and Identification of Positive Clones

The screening assay is described in Example III.
Colonies with increased levels of hapten binding are
identified and colony purified. These colonies are expanded
and analyzed for the presence of unintended mutations.
Optimized promoter sequences are identified by sequencing the
expression plasmids from positive colonies.
EXAMPLE V

In yet another preferred embodiment of this invention,
EIPCR is employed to create a eukaryotic mutagenesis library.
Similar to EIPCR in E. coli, any region of a eukaryotic vector
can be modified. Eukaryotic expression vectors may be
modified in regulatory regions or within translated regions of
a particular gene. In this example, a retroviral expression
vector pLN is used to generate a library of mutations within
the ribosome binding site of the Neomycin resistance gene. The
ribosome binding site, also known as a Kozak sequence (Kozak,
M., Nuc. Acids Res. 12(2):857-72, 1984, which is hereby
incorporated by reference), is a highly conserved region in
eukaryotic cells comprising the consensus sequence
CCACCATG(G).

Expression Vector 

The retroviral expression plasmid pLN was obtained from
A.D. Miller and is described in a publication by Miller et al.
(BioTechniques 7(9):980-990, 1989, which is hereby incorporated
by reference). The vector contains two Moloney Murine
Leukemia Virus (MoMuLV) long terminal repeats (LTR). Between
the LTR regions is the Neomycin resistance gene (Neo^r). The
Neo^r ribosome binding site is targeted for library mutagenesis
to confer increased resistance to G418 in the eukaryotic cell
line NIH 3T3 (ATCC). The plasmid has a final size of 6 kb.

Oligonucleotide Synthesis

Oligonucleotides are prepared that are similar in design
to those described for Example I above. The primers are
designed to flank the Neo^r ribosome binding site and are
substantially complementary to both strands of DNA. A short
(4-10 bp) variable region is designed to overlap the ribosome
binding site. Thus, the oligonucleotides contain a class IIS
recognition site, the variable region, and a twenty base
complementary region that anchors the oligonucleotides to the
pLN plasmid.
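
The length of the variable region sets the theoretical complexity of the library: with all four bases allowed at each of n positions there are 4^n possible sequences. The short Python calculation below covers the 4-10 bp range mentioned above. It is illustrative only; the function name is hypothetical, and a biased primer mixture would sample this space unevenly.

```python
def library_complexity(n_variable_positions, alphabet_size=4):
    """Number of distinct sequences possible over the variable region."""
    return alphabet_size ** n_variable_positions


for n in (4, 6, 8, 10):
    print(f"{n} variable positions -> "
          f"{library_complexity(n):,} possible sequences")
```

The number of transformants or transduced cells screened would need to be compared against this figure to judge how much of the sequence space a given library actually covers.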

Amplification Conditions

Reaction tubes are prepared for PCR in a final 100 µl
reaction volume. Reaction conditions are optimized from
initial reaction conditions as outlined in Example III.
Following PCR, the DNA is purified, cleaved with the desired
class IIS restriction endonuclease, recircularized and
ligated.

Isolation of Packaged Vector

Ligated product from the PCR reaction is electroporated
into the helper virus packaging cell line PE501 obtained from
A.D. Miller and described by Miller et al., supra. Mutated pLN
is transiently packaged into retroviral particles using PE501.
Cell supernatant containing viral particles is harvested from
the packaging cell line and titered on virus susceptible NIH
3T3 cells (ATCC).

Selection and Identification of Mutated Sequences

Colonies expressing mutations are selected with elevated
levels of G418, preferably between 0.75 and 2.5 mg/ml. These
colonies are expanded, lysed, and if desired, the DNA is
purified. The optimized promoter region is retrieved from the
selected cells by PCR. This new Kozak sequence can then be
reintroduced into pLN to verify that the new sequence confers
elevated G418 resistance. The region is sequenced to identify
the selected nucleic acid sequence. The results from this work
permit the identification of sequences conferring increased
G418 resistance and facilitate the identification of Kozak
sequence requirements and the isolation of improved sequences
that can be transferred to other constructs to improve the
expression of other protein sequences.

It is additionally contemplated that this technology
could be applied to any gene in combination with a selectable
marker such as Neo^r. Therefore any gene or portion of a gene
can be mutated and initially selected by its resistance to
Neomycin. Subsequent selection will be required to
distinguish the optimized mutation. Neomycin resistance is
just one of a variety of selection systems useful for EIPCR
library mutagenesis applications. For example, as a selection
procedure, transfected cells can be screened by a Fluorescence
Activated Cell Sorter (FACS) and positive colonies expanded
from these cells for further analysis.

Thus, EIPCR library mutagenesis is a reliable and
efficient method for obtaining optimized nucleic acid
sequences. EIPCR reactions have an efficiency of 95% or
better in reactions designed to measure the efficiency of
mutagenesis. EIPCR library mutagenesis is generally
applicable for de novo design or redesign of protein or
nucleic acid sequences.

Although the invention has been described with reference
to the above examples, it should be understood that various
modifications can be made by those skilled in the art without
departing from the invention. Accordingly, the invention is
limited only by the following claims.
SEQUENCE LISTING

(1) GENERAL INFORMATION:

(i) APPLICANT: STEMMER, WILLEM

(ii) TITLE OF INVENTION: ENZYMATIC INVERSE POLYMERASE CHAIN
REACTION

(iii) NUMBER OF SEQUENCES: 32

(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: KNOBBE, MARTENS, OLSON & BEAR
(B) STREET: 620 NEWPORT CENTER DRIVE, SIXTEENTH FLOOR
(C) CITY: NEWPORT BEACH
(D) STATE: CALIFORNIA
(E) COUNTRY: UNITED STATES
(F) ZIP: 92660

(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: PatentIn Release #1.0, Version #1.25

(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER:
(B) FILING DATE:
(C) CLASSIFICATION:

(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: ISRAELSEN, NED A.
(B) REGISTRATION NUMBER: 29,655
(C) REFERENCE/DOCKET NUMBER: HYBRIT.001CP1

(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: 619-235-3550
(B) TELEFAX: 619-235-0139



(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 54 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: circular

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

AAATCTGGAG CCGGTGAGCG TGGGTCTCGC GGTATCATTG CAGCACTGGG GCCA 54

(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 36 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

ATTAGGTCTC GGTTCCCGCG GTATCATTGC AGCACT 36

(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 35 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

AATTGGTCTC GGAACCACGC TCACCGGCTC CAGAT 35

(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 54 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: circular

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

AAATCTGGAG CCGGTGAGCG TGGTTCCCGC GGTATCATTG CAGCACTGGG GCCA 54

(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 11 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

GGTCTCNNNN N 11

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 12 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

GAAGACNNNN NN 12

(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 10 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

CTCTTCNNNN 10



(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 7 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

GAATGCN 7

(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 14 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

ACCTGCNNNN NNNN 14

(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 10 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

GGATCNNNNN 10

(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 17 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

GCAGCNNNNN NNNNNNN 17

(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 10 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

GTCTCNNNNN 10

(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 6 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

ACTGGN 6



(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

GGATGNNNNN NNNNNNNN 18

(2) INFORMATION FOR SEQ ID NO: 15:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 15 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

GACGCNNNNN NNNNN 15

(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 13 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:

GGTGANNNNN NNN 13



(2) INFORMATION FOR SEQ ID NO: 17:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 13 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

GAAGANNNNN NNN 13

(2) INFORMATION FOR SEQ ID NO: 18:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 10 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:

GAGTCNNNNN 10

(2) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 14 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:

GCATCNNNNN NNNN 14

(2) INFORMATION FOR SEQ ID NO: 20:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 11 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:

CCTCNNNNNN N 11

(2) INFORMATION FOR SEQ ID NO: 21:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 90 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: circular

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:

AGGAACCAAA CTGACTGTCC TAGGATAGAA GGAGATATAT CATGAAAAAG ACAGCTGGCG 60
CAGGCCGAGG TGACCCTGGT GGAGTCTGGG 90
(2) INFORMATION FOR SEQ ID NO: 22:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 58 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:

ATTAGAAGAC TACTCCNNNN NNNNNNNNNN NNNNNNNGAG GTGACCCTGG TGGAGTCT 58

(2) INFORMATION FOR SEQ ID NO: 23:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 58 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:

AATTGAAGAC ATGGAGNNNN NNNNNNNNNN NNNNNNTCCT AGGACAGTCA GTTTGGTT 58

(2) INFORMATION FOR SEQ ID NO: 24:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 94 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: circular

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 2..94

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:

A GGA ACC AAA CTG ACT GTC CTA GGA CGG AAA TCG GGG CGG TCT ACC 46
  Gly Thr Lys Leu Thr Val Leu Gly Arg Lys Ser Gly Arg Ser Thr
  1               5                   10                  15

TCC CCT CTC CCA ATA AAA TTA GGG GAG GTG ACC CTG GTG GAG TCT GGG 94
Ser Pro Leu Pro Ile Lys Leu Gly Glu Val Thr Leu Val Glu Ser Gly
                20                  25                  30

(2) INFORMATION FOR SEQ ID NO: 25:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 31 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:

Gly Thr Lys Leu Thr Val Leu Gly Arg Lys Ser Gly Arg Ser Thr Ser
1               5                   10                  15

Pro Leu Pro Ile Lys Leu Gly Glu Val Thr Leu Val Glu Ser Gly
            20                  25                  30

(2) INFORMATION FOR SEQ ID NO: 26:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 72 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:

TCTATAGGTC TCTTTGCNGT NGCNCTNGCN GGNTTYGCNA CNGTNGCNCA RGCNGAGGTG 60
ACCCTGGTGG AG 72

(2) INFORMATION FOR SEQ ID NO: 27:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 56 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:

TATTAAGGTC TCAGCAATNG CRATNGCNGT YTTYTTCATG ATATATCTCC TTCTAT 56



(2) INFORMATION FOR SEQ ID NO: 28:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 63 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: circular

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:

ATG AAA AAA ACC GCG ATC GCC ATT GCT GTG GCG CTT GCC 39
MET LYS LYS THR ALA ILE ALA ILE ALA VAL ALA LEU ALA
1               5                   10

GGC TTT GCT ACG GTG GCG CAG GCA 63
GLY PHE ALA THR VAL ALA GLN ALA
    15                  20

(2) INFORMATION FOR SEQ ID NO: 29:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 63 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: circular

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29:

ATG AAA AAA ACT GCA ATT GCG ATT GCT GTT GCT CTT GCT 39
MET LYS LYS THR ALA ILE ALA ILE ALA VAL ALA LEU ALA
1               5                   10

GGT TTC GCG ACG GTA GCA CAG GCC 63
GLY PHE ALA THR VAL ALA GLN ALA
    15                  20



(2) INFORMATION FOR SEQ ID NO: 30:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 13 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30:

AAGGAGATAT ATC 13

(2) INFORMATION FOR SEQ ID NO: 31:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 85 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31:

AACTATTGGT CTCAGTGGAA TTGTGAGCGG ATAACAATTT CACACAGGAA ACAGCTATGA 60
AAAAAACCGC GATCGCCATT GCTGT 85



(2) INFORMATION FOR SEQ ID NO: 32:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 109 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32:

ATCATTAGGT CTCACCACAC AACATACGAG CCGGAAGCAT AAAGTGTAAA GCCTGGGGTG 60
AAAAAAAAAG GCTCCAAAAG GAGCCTTTCT ATCCTAGGAC AGTCAGTTT 109

(2) INFORMATION FOR SEQ ID NO: 33:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 41 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33:

CACCCCAGGC TTTACACTTT ATGCTTCCGG CTCGTATGTT G 41

(2) INFORMATION FOR SEQ ID NO: 34:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 12 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34:

CAGGAAACAG CT 12
We Claim:

1. A method for generating a recombinant mutagenesis
library by introducing one or more changes within a
predetermined region of double stranded nucleic acid,
comprising:

(a) providing a first primer population and a
second primer population, each said population having a
variable base composition at known positions along said
primers, said primers incorporating a class IIS
restriction enzyme recognition sequence, being capable
of directing change in said nucleic acid sequence and
being substantially complementary to said double-stranded
nucleic acid to allow hybridization thereto;

(b) hybridizing said first and second primer
populations to opposite strands of said double stranded
nucleic acid to form a first pair of primer-templates
oriented in opposite directions;

(c) performing enzymatic inverse polymerase chain
reaction to generate at least one linear copy of said
double stranded nucleic acid incorporating said change
directed by said primers;

(d) cutting the double stranded nucleic acid copy
of step (c) with a class IIS restriction enzyme to form
a restricted linear nucleic acid molecule containing
said change; and

(e) introducing nucleic acid generated from step
(c) or (d) into compatible host cells.

2. The method of Claim 1, additionally comprising the
step of joining termini of said restricted linear nucleic
acid molecule of step (d) to produce double-stranded
circular nucleic acid.

3. The method of Claim 1, wherein said restricted
linear nucleic acid molecule produced in step (d) contains
only said change in said nucleic acid sequence.

4. The method of Claim 1, wherein at least steps (b)
and (c) are repeated one or more times.

5. The method of Claim 1, wherein said double-stranded
nucleic acid is circular DNA.

6. The method of Claim 1, wherein step (d) further
comprises treating said restricted nucleic acid molecule
with a polymerase under conditions which create blunt ends.

7. The method of Claim 1, wherein said host cells are
bacteria.

8. The method of Claim 1 wherein said double stranded
nucleic acid encodes polypeptide.

9. The method of Claim 8, additionally comprising the
step of expressing said polypeptide encoded by the nucleic
acid of step (e).

10. The method of Claim 1, wherein said cells are
eukaryotic.

11. The method of Claim 8 wherein said change is
located within a polypeptide encoding region of the
double-stranded nucleic acid.

12. The method of Claim 8, wherein said change is
located within a regulatory region of said double-stranded
nucleic acid.

13. The method of Claim 12, wherein said change is
located within a promoter region of said double-stranded
nucleic acid.

14. The method of Claim 8, wherein said change is
located within the enhancer region of said double-stranded
nucleic acid.

15. The method of Claim 1, wherein said double
stranded nucleic acid comprises a viral vector.

16. The method of Claim 15, wherein said compatible
host cells comprise a helper virus packaging cell line that
directs the packaging of viral particles containing said
viral vector.

17. The method of Claim 16, comprising the step of
collecting said viral particles.

18. The method of Claim 17, additionally comprising
the step of infecting susceptible cells with said viral
particles.

19. A recombinant library created by the method of
Claim 1.

20. A method for improving polypeptide expression from
a double-stranded nucleic acid sequence encoding polypeptide
comprising:

(a) measuring polypeptide expression from said
double-stranded nucleic acid in a compatible host cell;

(b) providing a first primer population and a
second primer population, each said population having a
variable base composition at known positions along said
primers, said primers incorporating a class IIS
restriction enzyme recognition sequence, being capable
of directing change in said nucleic acid sequence and
being substantially complementary to said double-stranded
nucleic acid to allow hybridization thereto;

(c) hybridizing said first and second primer
populations to opposite strands of said double stranded
nucleic acid to form a first pair of primer-templates
orientated in opposite directions;

(d) performing enzymatic inverse polymerase chain
reaction to generate at least one linear copy of said
double stranded nucleic acid incorporating said change
directed by said primers;

(e) cutting said double stranded nucleic acid
copy of step (d) with a class IIS restriction enzyme to
form a restricted linear nucleic acid molecule
containing said change;

(f) introducing said nucleic acid generated from
step (d) or (e) into said host cells;

(g) measuring polypeptide expression from said
modified nucleic acid of step (f) in said cells; and

(h) identifying cells with expression levels
greater than the expression levels measured in step
(a).

21. The method of Claim 20, additionally comprising
the step of joining termini of said restricted linear
nucleic acid of step (e) to produce modified double-stranded
circular nucleic acid.

22. The method of Claim 20, additionally comprising
the step of obtaining modified template from said identified
cells.

23. The method of Claim 22, comprising the step of
identifying the modified nucleic acid sequence.

24. The method of Claim 22, comprising transferring
the modified sequence into another nucleic acid sequence.

25. The method of Claim 21, wherein said primers
direct changes in a promoter sequence.

26. The method of Claim 21, wherein said primers
direct changes in a polypeptide sequence.

27. The method of Claim 21, wherein said compatible
cells are bacteria.

28. The method of Claim 21, wherein said cells are
eukaryotes.

29. The method of Claim 21, wherein said primers
direct changes in a ribosome binding sequence.

30. A method for generating a recombinant library
using wobble-base mutagenesis comprising:

(a) providing a first primer population and a
second primer population, said primers being
substantially complementary to a region of double
stranded nucleic acid encoding polypeptide to allow
hybridization thereto, said primers having a variable
base composition in the third position of at least one
nucleotide codon corresponding to said double stranded
nucleic acid and a class IIS restriction enzyme
recognition sequence;

(b) hybridizing said first and second primer
populations to opposite strands of said double stranded
nucleic acid to form a first pair of primer-templates
orientated in opposite directions;

(c) performing enzymatic inverse polymerase chain
reaction to generate at least one linear copy of said
double stranded nucleic acid incorporating said change
directed by said primers;

(d) cutting said double stranded linear nucleic
acid of step (c) with a class IIS restriction enzyme to
form restricted linear nucleic acid molecule containing
said change; and

(e) introducing nucleic acid generated from step
(c) or (d) into compatible host cells.

31. The method of Claim 30, additionally comprising
joining termini of said restricted linear nucleic acid of
step (d) to produce double-stranded circular nucleic acid.

32. The method of Claim 30, wherein said variable base
codons do not alter the corresponding amino acid sequence of
said polypeptide.

33. The method of Claim 30, wherein said primers
direct alterations in the leader sequence of said
polypeptide.

34. The method of Claim 30, wherein said host cells
are bacteria.

35. The method of Claim 33, wherein said leader
sequence is the bacterial OmpA protein leader sequence or a
fragment thereof.

36. The method of Claim 33, wherein said leader
sequence is linked to polynucleotide encoding light and
heavy chain antibody fragments.

37. An optimized OmpA protein leader:
5' ATGAAAAAAACTGCAATTGCGATTGCTGTTGCTCTTGCTGGTTTCGCGACGGTAGCACAGGCC 3',
or an expression promoting fragment thereof.

38. An optimized OmpA protein leader sequence:
5' ATGAAAAAAACCGCGATCGCCATTGCTGTGGCGCTTGCCGGCTTTGCTACGGTGGCGCAGG 3'
or an expression promoting fragment thereof.
AMENDED CLAIMS

[received by the International Bureau on 18 June 1993 (18.06.93);
original claims 2, 5, 6, 17, 18, 21, 23, 24, 31 and 33 deleted;
original claims 1, 3, 7 and 27 amended; remaining claims renumbered and reordered (5 pages)]

1. A method of generating a recombinant mutagenesis
library by introducing at least one nucleotide change in at
least one predetermined location of circular double-stranded
DNA comprising:

(a) providing a first primer population and a
second primer population wherein said first and second
primer populations each comprise in a 5' to 3'
orientation a class IIS restriction enzyme recognition
sequence, at least one nucleotide change to be
introduced into at least one predetermined location of
circular double-stranded DNA, and a sequence
substantially complementary to said double-stranded DNA;

(b) hybridizing said first and second primer
populations to opposite strands within the same region of
said double-stranded DNA in the same reaction vessel;

(c) performing at least one cycle of a polymerase
chain reaction to produce double-stranded linear
molecules terminating with said class IIS restriction
enzyme recognition sequence and including said change to
be introduced into at least one predetermined location of
circular double-stranded DNA;

(d) digesting said double-stranded linear molecules
with at least one class IIS restriction enzyme to produce
overhanging termini that are complementary to one another;
and

(e) ligating said overhanging termini to
recircularize said double-stranded linear molecules
containing said change in at least one predetermined
location.

(f) introducing said double-stranded molecules of
step (e) into compatible host cells.

2. Recombinant mutagenesis libraries created by the
method of Claim 1.

3. The method of Claim 1, wherein said digested linear
molecules produced in step (d) contain only said change
introduced into said predetermined location within the primer
populations of step (a).

4. The method of Claim 1, wherein said host cells are
E. coli.

5. The method of Claim 1, wherein said cells are
eukaryotic.

6. The method of Claim 1, wherein said double-stranded
circular DNA encodes polypeptide.

7. The method of Claim 6, additionally comprising the
step of expressing said polypeptide encoded by the nucleic
acid of step (e).

8. The method of Claim 6, wherein said change is
located within the region of said double-stranded circular DNA
encoding polypeptide.

9. The method of Claim 6, wherein said change is
located within a regulatory region of said double-stranded
circular DNA.

10. The method of Claim 9, wherein said change is
located within a promoter region of said double-stranded
circular DNA.

11. The method of Claim 9, wherein said change is
located within the enhancer region of said double-stranded
circular DNA.

12. The method of Claim 1, wherein said double-stranded
circular DNA is a plasmid.

13. The method of Claim 1, wherein said double-stranded
circular DNA comprises a viral vector.

14. The method of Claim 13, wherein said host cells
comprise a helper virus packaging cell line that directs the
packaging of viral particles containing said viral vector.

15. A method for improving polypeptide expression from 
a DNA sequence encoding polypeptide contained in a double- 
stranded circular DNA molecule comprising: 

(a) measuring polypeptide expression from said
double-stranded DNA in a compatible host cell;

(b) providing a first primer population and a
second primer population wherein said first and second
primer populations each comprise in a 5' to 3'
orientation a class IIS restriction enzyme recognition
sequence, at least one change to be introduced into at
least one predetermined location of circular double-
stranded DNA, and a sequence substantially complementary
to said double-stranded DNA;

(c) hybridizing said first and second primer 
populations to opposite strands within the same region of 
said DNA molecule in the same reaction vessel; 

(d) performing at least one cycle of a polymerase 
chain reaction to produce double-stranded linear 
molecules terminating in said class IIS restriction 
enzyme recognition sequence wherein said change is now 
located in said predetermined location of circular 
double-stranded DNA; 

(e) digesting said double-stranded linear DNA with 
at least one class IIS restriction enzyme to produce 
overhanging termini that are complementary to one another;

(f) ligating said overhanging termini to 
recircularize said double-stranded linear molecules 
containing said change in said predetermined location; 

(g) introducing said nucleic acid generated from 
step (f) into said host cells; 

(h) measuring polypeptide expression from the
nucleic acid of step (f); and

(i) identifying cells with polypeptide expression
levels greater than the expression levels measured in
step (a).

16. The method of Claim 15, additionally comprising the 
step of retrieving double-stranded circular DNA molecules from 
the cells of step (i) and identifying said change in said 
predetermined location of circular double-stranded DNA. 

17. The method of Claim 15, wherein said primer
populations direct change in a promoter sequence.

18. The method of Claim 15, wherein said primer
populations direct change in a polypeptide sequence.

19. The method of Claim 15, wherein said primer
populations direct change in a ribosome binding sequence.

20. The method of Claim 15, wherein said host cells are
E. coli.

21. The method of Claim 15, wherein said host cells are
eukaryotes.

22. A method for generating a recombinant library of
double-stranded circular DNA molecules using wobble base
mutagenesis comprising:

(a) providing a first primer population and a
second primer population, each of said primer populations
being substantially complementary to a region of double-
stranded DNA encoding polypeptide to allow hybridization
thereto under standard polymerase chain reaction
conditions, said primers having a class IIS restriction
enzyme recognition sequence and at least one of said
primer populations having a region containing a variable
base composition in the third position of at least one
nucleotide codon corresponding to a polypeptide encoding
region contained in a double stranded circular DNA
molecule, wherein said class IIS restriction enzyme
recognition sequence is located 5' to said region
containing a variable base composition;

(b) hybridizing said first and second primer 
populations to opposite strands of said double-stranded 
DNA to form a first pair of primer-templates orientated 
in opposite directions; 

(c) performing an enzymatic inverse polymerase 
chain reaction to generate at least one linear copy of 
said double-stranded circular DNA molecule wherein said 
DNA molecule incorporates said change in the third 
position of at least one nucleotide codon;

(d) digesting said double stranded linear DNA of
step (c) with at least one class IIS restriction enzyme
to produce overhanging termini that are complementary to
one another;

(e) ligating said overhanging termini to
recircularize said double-stranded linear DNA; and

(f) introducing the DNA of step (e) into compatible
host cells.

23. The method of Claim 22, wherein said change in the
third position of at least one nucleotide codon does not alter
the corresponding amino acid sequence of said polypeptide.

24. The method of Claim 23, wherein said change is
located in the leader sequence of said polypeptide.

25. The method of Claim 22, wherein said host cells are
E. coli.

26. The method of Claim 25, wherein said leader sequence
is derived from the bacterial OmpA protein leader sequence or
a fragment thereof.

27. The method of Claim 25, wherein said leader sequence 
is linked to polynucleotide encoding light and heavy chain 
antibody fragments. 

28. An optimized leader sequence according to the method
of Claim 26 having the following nucleic acid sequence: 5'
ATGAAAAAAACTGCAATTGCGATTGCTGTTGCTCTTGCTGGTTTCGCGACGGTAGCACAG
GCC 3', or an expression promoting fragment thereof.

29. An optimized leader sequence according to the method
of Claim 26 having the following nucleic acid sequence: 5'
ATGAAAAAAACCGCGATCGCCATTGCTGTGGCGCTTGCCGGCTTTGCTACGGTGGCGCAGG
3', or an expression promoting fragment thereof.
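
As an illustration of the wobble base changes recited in Claims 22 to 29, the following Python sketch translates the two leader sequences of Claims 28 and 29 and lists the codons at which they differ. It is a sketch under stated assumptions, not part of the claimed method: the standard genetic code is assumed, the function names (translate, codon_differences, synonymous_wobble_variants) are hypothetical, and because the Claim 29 sequence as printed contains 61 nucleotides, its final incomplete codon is ignored.

```python
# Illustrative sketch only; all names below are hypothetical helpers.
# Assumption: the standard genetic code, written in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

# Leader sequences copied from Claims 28 and 29 as printed.
LEADER_CLAIM_28 = (
    "ATGAAAAAAACTGCAATTGCGATTGCTGTTGCTCTTGCTGGTTTCGCGACGGTAGCACAG" "GCC"
)
LEADER_CLAIM_29 = "ATGAAAAAAACCGCGATCGCCATTGCTGTGGCGCTTGCCGGCTTTGCTACGGTGGCGCAGG"


def codons(dna):
    """Split a sequence into complete codons, dropping any trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return [dna[i:i + 3] for i in range(0, usable, 3)]


def translate(dna):
    """Translate the complete codons of a coding-strand sequence."""
    return "".join(CODON_TABLE[codon] for codon in codons(dna))


def codon_differences(seq_a, seq_b):
    """List (codon number, codon in a, codon in b, changed positions 1-3)
    for every codon shared by both sequences that is not identical."""
    diffs = []
    for number, (ca, cb) in enumerate(zip(codons(seq_a), codons(seq_b)), start=1):
        if ca != cb:
            changed = [pos + 1 for pos in range(3) if ca[pos] != cb[pos]]
            diffs.append((number, ca, cb, changed))
    return diffs


def synonymous_wobble_variants(codon):
    """Return the codons that share the first two bases and the same amino acid."""
    amino_acid = CODON_TABLE[codon]
    return sorted(
        codon[:2] + base for base in BASES if CODON_TABLE[codon[:2] + base] == amino_acid
    )


if __name__ == "__main__":
    print(translate(LEADER_CLAIM_28))
    print(translate(LEADER_CLAIM_29))
    for row in codon_differences(LEADER_CLAIM_28, LEADER_CLAIM_29):
        print(row)
```

Under these assumptions, each codon at which the two printed sequences differ changes only in its third position, and the codons common to both translate to the same amino acids, which is the property recited in Claim 23.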

AMENDED CLAIMS

[received by the International Bureau on 18 June 1993 (18.06.93);
original claims 2, 5, tx l 7, 18, 21, 23, 24, 31 and 33 deleted;
original claims 1, 3, 7 and 27 amended; remaining claims renumbered and reordered (5 pages)]

1. A method of generating a recombinant mutagenesis
library by introducing at least one nucleotide change in at
least one predetermined location of circular double-stranded
DNA comprising:

(a) providing a first primer population and a
second primer population wherein said first and second
primer populations each comprise in a 5' to 3'
orientation a class IIS restriction enzyme recognition
sequence, at least one nucleotide change to be
introduced into at least one predetermined location of
circular double-stranded DNA, and a sequence
substantially complementary to said double-stranded DNA;

(b) hybridizing said first and second primer
populations to opposite strands within the same region of
said double-stranded DNA in the same reaction vessel;

(c) performing at least one cycle of a polymerase
chain reaction to produce double-stranded linear
molecules terminating with said class IIS restriction
enzyme recognition sequence and including said change to
be introduced into at least one predetermined location of
circular double-stranded DNA;

(d) digesting said double-stranded linear molecules
with at least one class IIS restriction enzyme to produce
overhanging termini that are complementary to one another;
and

(e) ligating said overhanging termini to
recircularize said double-stranded linear molecules
containing said change in at least one predetermined
location.

(f) introducing said double-stranded molecules of
step (e) into compatible host cells.

2. Recombinant mutagenesis libraries created by the
method of Claim 1.

3. The method of Claim 1, wherein said digested linear
molecules produced in step (d) contain only said change
introduced into said predetermined location within the primer
populations of step (a).

4. The method of Claim 1, wherein said host cells are
E. coli.

5. The method of Claim 1, wherein said cells are
eukaryotic.

6. The method of Claim 1, wherein said double-stranded
circular DNA encodes polypeptide.

7. The method of Claim 6, additionally comprising the
step of expressing said polypeptide encoded by the nucleic
acid of step (e).

8. The method of Claim 6, wherein said change is 
located within the region of said double-stranded circular DNA 
encoding polypeptide. 

9. The method of Claim 6, wherein said change is 
located within a regulatory region of said double-stranded 
circular DNA. 

10. The method of Claim 9, wherein said change is
located within a promoter region of said double-stranded 
circular DNA. 

11. The method of Claim 9, wherein said change is 
located within the enhancer region of said double-stranded 
circular DNA. 

12. The method of Claim 1, wherein said double-stranded 
circular DNA is a plasmid. 

13. The method of Claim 1, wherein said double-stranded 
circular DNA comprises a viral vector. 

14. The method of Claim 13, wherein said host cells
comprise a helper virus packaging cell line that directs the 
packaging of viral particles containing said viral vector. 

15. A method for improving polypeptide expression from 
a DNA sequence encoding polypeptide contained in a double- 
stranded circular DNA molecule comprising: 

(a) measuring polypeptide expression from said
double-stranded DNA in a compatible host cell;

(b) providing a first primer population and a
second primer population wherein said first and second
primer populations each comprise in a 5' to 3'
orientation a class IIS restriction enzyme recognition
sequence, at least one change to be introduced into at
least one predetermined location of circular double-
stranded DNA, and a sequence substantially complementary
to said double-stranded DNA;

(c) hybridizing said first and second primer
populations to opposite strands within the same region of
said DNA molecule in the same reaction vessel; 

(d) performing at least one cycle of a polymerase 
chain reaction to produce double-stranded linear 
molecules terminating in said class IIS restriction 
enzyme recognition sequence wherein said change is now 
located in said predetermined location of circular 
double-stranded DNA; 

(e) digesting said double-stranded linear DNA with 
at least one class IIS restriction enzyme to produce 
overhanging termini that are complementary to one another;

(f) ligating said overhanging termini to 
recircularize said double-stranded linear molecules 
containing said change in said predetermined location; 

(g) introducing said nucleic acid generated from 
step (f) into said host cells; 

(h) measuring polypeptide expression from the 
nucleic acid of step (f); and

(i) identifying cells with polypeptide expression
levels greater than the expression levels measured in
step (a).

16. The method of Claim 15, additionally comprising the 
step of retrieving double-stranded circular DNA molecules from 
the cells of step (i) and identifying said change in said 
predetermined location of circular double-stranded DNA. 

17. The method of Claim 15, wherein said primer
populations direct change in a promoter sequence.

18. The method of Claim 15, wherein said primer
populations direct change in a polypeptide sequence.

19. The method of Claim 15, wherein said primer
populations direct change in a ribosome binding sequence.

20. The method of Claim 15, wherein said host cells are
E. coli.

21. The method of Claim 15, wherein said host cells are
eukaryotes.

22. A method for generating a recombinant library of
double-stranded circular DNA molecules using wobble base
mutagenesis comprising:

(a) providing a first primer population and a
second primer population, each of said primer populations
being substantially complementary to a region of double-
stranded DNA encoding polypeptide to allow hybridization
thereto under standard polymerase chain reaction
conditions, said primers having a class IIS restriction
enzyme recognition sequence and at least one of said
primer populations having a region containing a variable
base composition in the third position of at least one
nucleotide codon corresponding to a polypeptide encoding
region contained in a double stranded circular DNA
molecule, wherein said class IIS restriction enzyme
recognition sequence is located 5' to said region
containing a variable base composition;

(b) hybridizing said first and second primer
populations to opposite strands of said double-stranded
DNA to form a first pair of primer-templates orientated
in opposite directions;

(c) performing an enzymatic inverse polymerase
chain reaction to generate at least one linear copy of
said double-stranded circular DNA molecule wherein said
DNA molecule incorporates said change in the third
position of at least one nucleotide codon;

(d) digesting said double stranded linear DNA of
step (c) with at least one class IIS restriction enzyme
to produce overhanging termini that are complementary to
one another;

(e) ligating said overhanging termini to
recircularize said double-stranded linear DNA; and

(f) introducing the DNA of step (e) into compatible
host cells.

23. The method of Claim 22, wherein said change in the
third position of at least one nucleotide codon does not alter
the corresponding amino acid sequence of said polypeptide.

24. The method of Claim 23, wherein said change is
located in the leader sequence of said polypeptide.

25. The method of Claim 22, wherein said host cells are
E. coli.

26. The method of Claim 25, wherein said leader sequence
is derived from the bacterial OmpA protein leader sequence or
a fragment thereof.

27. The method of Claim 25, wherein said leader sequence 
is linked to polynucleotide encoding light and heavy chain
antibody fragments. 

28. An optimized leader sequence according to the method
of Claim 26 having the following nucleic acid sequence: 5'
ATGAAAAAAACTGCAATTGCGATTGCTGTTGCTCTTGCTGGTTTCGCGACGGTAGCACAG
GCC 3', or an expression promoting fragment thereof.

29. An optimized leader sequence according to the method
of Claim 26 having the following nucleic acid sequence: 5'
ATGAAAAAAACCGCGATCGCCATTGCTGTGGCGCTTGCCGGCTTTGCTACGGTGGCGCAGG
3', or an expression promoting fragment thereof.
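
The digestion and ligation steps recited in Claim 1, steps (d) and (e), can be pictured with the following Python sketch, which models cutting a linear product with a class IIS enzyme and rejoining the resulting ends. The claims above do not name a particular enzyme; BsaI (recognition sequence GGTCTC, cutting one nucleotide past the site on one strand and five on the other to leave a four-nucleotide overhang) is assumed here purely as an example. The toy sequence and the function names are hypothetical, and only the top strand is modelled.

```python
# Illustrative sketch only; names and the example sequence are hypothetical.
# Assumption: a BsaI-like class IIS enzyme, GGTCTC(1/5), 4-nt 5' overhang.
SITE = "GGTCTC"
TOP_CUT = 1    # top strand is cut this many nucleotides past the site
OVERHANG = 4   # the opposite strand is cut this many nucleotides further on


def revcomp(dna):
    """Reverse complement of an upper-case DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def digest_and_ligate(linear_top):
    """Cut a linear product that carries an inward-facing site at each end,
    then join the ends. Returns the top strand of the recircularized molecule,
    written as a linear string starting at the ligation junction."""
    left = linear_top.find(SITE)              # site on the top strand, left end
    right = linear_top.rfind(revcomp(SITE))   # inverted site, right end
    if left == -1 or right == -1:
        raise ValueError("expected an inward-facing site at each end")

    start = left + len(SITE) + TOP_CUT        # first retained top-strand base
    end = right - TOP_CUT - OVERHANG          # top-strand cut at the right end
    if end < start + OVERHANG:
        raise ValueError("sites are too close together")

    # Left end: 5' overhang on the top strand.
    left_overhang = linear_top[start:start + OVERHANG]
    # Right end: 5' overhang on the bottom strand, shown here as the
    # top-strand bases it pairs with.
    right_overhang = linear_top[end:end + OVERHANG]
    if left_overhang != right_overhang:
        raise ValueError("overhanging termini are not complementary")

    # Both recognition sites lie outside start..end, so they are lost.
    return linear_top[start:end]


# Toy product: tail + site + spacer + overhang + body + overhang + spacer
# + inverted site + tail.
EXAMPLE_PRODUCT = (
    "AA" + SITE + "A" + "ACCG" + "CGATCGCCATTGCT" + "ACCG" + "T" + revcomp(SITE) + "TT"
)

if __name__ == "__main__":
    print(digest_and_ligate(EXAMPLE_PRODUCT))
```

Because the enzyme cuts outside its recognition sequence, the sites carried on the primer tails are removed by the digestion in this model, and the rejoined molecule retains only the overhang and the sequence between the two cuts.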



